Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run Zebra's unit tests with the scanner enabled #8037

Closed
2 tasks done
Tracked by #7728
teor2345 opened this issue Nov 30, 2023 · 18 comments · Fixed by #8039
Closed
2 tasks done
Tracked by #7728

Run Zebra's unit tests with the scanner enabled #8037

teor2345 opened this issue Nov 30, 2023 · 18 comments · Fixed by #8039
Assignees
Labels
A-blockchain-scanner Area: Blockchain scanner of shielded transactions A-devops Area: Pipelines, CI/CD and Dockerfiles C-testing Category: These are tests

Comments

@teor2345
Copy link
Contributor

teor2345 commented Nov 30, 2023

Motivation

We're not running the shielded scanner tests in CI, leading to bugs like:
9a7848e

Required Changes

if [[ -n "${{ vars.RUST_EXPERIMENTAL_FEATURES }}" && "${{ vars.RUST_EXPERIMENTAL_FEATURES }}" != " " ]]; then
docker run -e NETWORK --name zebrad-tests-experimental --tty ${{ vars.GAR_BASE }}/${{ vars.CI_IMAGE_NAME }}@${{ needs.build.outputs.image_digest }} cargo test --locked --release --features "${{ env.EXPERIMENTAL_FEATURES }} " --workspace -- --include-ignored
fi

@teor2345 teor2345 self-assigned this Nov 30, 2023
@mpguerra mpguerra added this to Zebra Nov 30, 2023
@github-project-automation github-project-automation bot moved this to 🆕 New in Zebra Nov 30, 2023
@teor2345 teor2345 added A-devops Area: Pipelines, CI/CD and Dockerfiles P-High 🔥 C-testing Category: These are tests A-blockchain-scanner Area: Blockchain scanner of shielded transactions labels Nov 30, 2023
@teor2345 teor2345 changed the title Run Zebra's unit tests with the scanner enabled, by adding shielded-scan to the RUST_EXPERIMENTAL_FEATURES GitHub repository variable Run Zebra's unit tests with the scanner enabled Nov 30, 2023
@teor2345
Copy link
Contributor Author

Let's see how it goes:
Screenshot 2023-12-01 at 07 57 33

@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in Zebra Nov 30, 2023
@teor2345
Copy link
Contributor Author

Instructions to revert:

  1. Get a repository or organisation admin
  2. Go to https://github.com/ZcashFoundation/zebra/settings/variables/actions
  3. Change the value of RUST_EXPERIMENTAL_FEATURES to (a single space)

@teor2345
Copy link
Contributor Author

It looks like this is going to take a significant number of changes:

failures:
config_tests
downgrade_state_format
end_of_support_is_checked_at_start
ephemeral_existing_directory
ephemeral_missing_directory
misconfigured_ephemeral_existing_directory
misconfigured_ephemeral_missing_directory
non_blocking_logger
persistent_mode
restart_stop_at_height
start_args
start_no_args
sync_large_checkpoints_mainnet
sync_large_checkpoints_mempool_mainnet
sync_one_checkpoint_mainnet
update_state_format
zebra_rpc_conflict
zebra_state_conflict
zebra_zcash_listener_conflict

https://github.com/ZcashFoundation/zebra/actions/runs/7052897741/job/19199392237?pr=8038#step:4:38827

So I'll revert the experimental features variable for now.

@teor2345
Copy link
Contributor Author

teor2345 commented Dec 1, 2023

Some of these failures could be caused by a panic in one test, let's try to fix the config test first.

@mpguerra
Copy link
Contributor

mpguerra commented Dec 1, 2023

Hey team! Please add your planning poker estimate with Zenhub @arya2 @oxarbitrage @teor2345 @upbqdn

@oxarbitrage
Copy link
Contributor

The config test will fail because we don't have the right version stored but there is also a lot of:

   Message:  Opening database "/home/alfredo/.cache/zebra/private-scan/v0/mainnet" failed: Error { message: "IO error: While lock file: /home/alfredo/.cache/zebra/private-scan/v0/mainnet/LOCK: Resource temporarily unavailable" }. Hint: Check if another zebrad process is running. Try changing the state cache_dir in the Zebra config.

Meaning i think the scanner database is not closing properly.

@teor2345
Copy link
Contributor Author

teor2345 commented Dec 5, 2023

The config test will fail because we don't have the right version stored

And we need to check that version parses, but we can only do that when the shielded-scan feature is enabled.

but there is also a lot of:

   Message:  Opening database "/home/alfredo/.cache/zebra/private-scan/v0/mainnet" failed: Error { message: "IO error: While lock file: /home/alfredo/.cache/zebra/private-scan/v0/mainnet/LOCK: Resource temporarily unavailable" }. Hint: Check if another zebrad process is running. Try changing the state cache_dir in the Zebra config.

Meaning i think the scanner database is not closing properly.

We need to shut down the database in a storage Drop impl, like we do in the FinalizedState. Let me know if you want help with that.

@teor2345
Copy link
Contributor Author

teor2345 commented Dec 5, 2023

but there is also a lot of:

   Message:  Opening database "/home/alfredo/.cache/zebra/private-scan/v0/mainnet" failed: Error { message: "IO error: While lock file: /home/alfredo/.cache/zebra/private-scan/v0/mainnet/LOCK: Resource temporarily unavailable" }. Hint: Check if another zebrad process is running. Try changing the state cache_dir in the Zebra config.

Meaning i think the scanner database is not closing properly.

We need to shut down the database in a storage Drop impl, like we do in the FonalizedState. Let me know if you want help with that.

Actually shutdown is being correctly called:

impl Drop for ZebraDb {
fn drop(&mut self) {
self.shutdown(false);
}
}

It looks like all the tests are using the default production scanner db path, but they should be using the ephemeral config. That can be fixed here:

pub fn default_test_config(net: Network) -> Result<ZebradConfig> {

@oxarbitrage
Copy link
Contributor

oxarbitrage commented Dec 5, 2023

Yes, i am trying to make this work. I already fixed the ephemeral but the error still happens.

Failures are happening only when all the tests are run together, they all pass individually.

@oxarbitrage
Copy link
Contributor

oxarbitrage commented Dec 5, 2023

I am moving to a (maybe temporal) solution that involves not starting the task if the feature is enabled and sapling_keys_to_scan is empty. This will make scan_task_starts the only test that will open a scanner database. I am still trying it but i think this will work, for now.

@teor2345
Copy link
Contributor Author

teor2345 commented Dec 5, 2023

I am moving to a (maybe temporal) solution that involves not starting the task if the feature is enabled and sapling_keys_to_scan is empty. This will make scan_task_starts the only test that will open a scanner database. I am still trying it but i think this will work, for now.

That might be ok for a temporary solution, but I think we will need the actual fix before we release the MVP. (Because it could be a bug in the production config handling.)

I am happy to look at your PR with the ephemeral fix.

@teor2345
Copy link
Contributor Author

teor2345 commented Dec 5, 2023

I also think it will help to merge this PR before we do much testing in CI, because it makes CI errors much easier to find:

@oxarbitrage
Copy link
Contributor

oxarbitrage commented Dec 5, 2023

I also think it will help to merge this PR before we do much testing in CI, because it makes CI errors much easier to find:

You can do that if you want but it is not needed, the errors are visible locally running cargo test --features="shielded-scan. I want scan_task_starts and config_tests to run before we continue.

I already mentioned the temporal solution i have, if you think it will not worth it please let me know and ill leave it to you, i was not planning to work on this ticket anyways and i think i am putting more effort than i was expected already.

@teor2345
Copy link
Contributor Author

teor2345 commented Dec 5, 2023

It seems like a good idea to make progress for now.

Here are the benefits and drawbacks I see:

Benefits:

  • Zebra is more efficient, because we don't start a useless scanner task (we do the same thing with RPCs and mempool)
  • We fix some tests and get them working, so we can move on with the rest of the project

Drawbacks:

  • We'll need a separate scanner database conflict test
  • We'll need a separate test for an ephemeral config

But we need do both these tests anyway. If we are just doing them by accident, it's easy to accidentally remove those tests when we change other tests. We can open a separate ticket for them.

@teor2345
Copy link
Contributor Author

teor2345 commented Dec 6, 2023

  • We'll need a separate scanner database conflict test
  • We'll need a separate test for an ephemeral config

But we need do both these tests anyway. If we are just doing them by accident, it's easy to accidentally remove those tests when we change other tests. We can open a separate ticket for them.

This is now #8069

@teor2345 teor2345 self-assigned this Dec 7, 2023
@teor2345
Copy link
Contributor Author

teor2345 commented Dec 7, 2023

I'm going to pick this up because it seems like we'll need it to make sure our code is working.

@teor2345
Copy link
Contributor Author

teor2345 commented Dec 7, 2023

I've added the repository variable again, here are the instructions to revert:

Instructions to revert:

  1. Get a repository or organisation admin
  2. Go to ZcashFoundation/zebra/settings/variables/actions
  3. Change the value of RUST_EXPERIMENTAL_FEATURES to (a single space)

@teor2345
Copy link
Contributor Author

teor2345 commented Dec 7, 2023

This seems to work after PR #8059, so all that's left to do here is the tests we split out in ticket #8069.

@teor2345 teor2345 closed this as completed Dec 7, 2023
@mpguerra mpguerra linked a pull request Dec 11, 2023 that will close this issue
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-blockchain-scanner Area: Blockchain scanner of shielded transactions A-devops Area: Pipelines, CI/CD and Dockerfiles C-testing Category: These are tests
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants