Rc vs 1036 updating docs for callset creation #8600

Closed
2 changes: 1 addition & 1 deletion scripts/variantstore/docs/aou/AOU_DELIVERABLES.md
@@ -57,9 +57,9 @@
- Run if there are any samples to withdraw from the last callset.
1. **TBD Workflow to soft delete samples**
1. `GvsPopulateAltAllele` workflow
- **TODO:** needs to be made cumulative so that it can add data to the existing table instead of creating it from scratch on each run (see [VS-52](https://broadworkbench.atlassian.net/browse/VS-52))
- This step loads data into the `alt_allele` table from the `vet_*` tables in preparation for running the filtering step.
- This workflow does not use the Terra Data Entity Model to run, so be sure to select the `Run workflow with inputs defined by file paths` workflow submission option.
- This step is cumulative (as are all of the steps prior), and it is the last step that is--so be sure that all samples have been loaded or withdrawn before progressing to the next step
@RoriCremer (Contributor, Author) commented on Dec 7, 2023

am I lying??

If data is loaded into the `alt_allele` table for a specific sample, and that exact sample is withdrawn later, how do we adjust the `alt_allele` table to be sure that the sample is not used in the filter model?
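
To make the concern concrete, here is a minimal in-memory sketch of one possible answer: exclude rows for withdrawn samples at read time, before any features are computed. This is not how GVS handles it (that is exactly the open question); `AltAlleleRow`, its fields, and `rows_for_filter_training` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AltAlleleRow:
    # Hypothetical, simplified stand-in for a row of the alt_allele table.
    sample_id: int
    location: int
    allele: str


def rows_for_filter_training(rows, withdrawn_sample_ids):
    """Return only the rows whose sample has not been withdrawn."""
    withdrawn = set(withdrawn_sample_ids)
    return [row for row in rows if row.sample_id not in withdrawn]


# Rows loaded cumulatively across earlier runs.
alt_allele = [
    AltAlleleRow(1, 1000, "A"),
    AltAlleleRow(2, 1000, "T"),
    AltAlleleRow(3, 2000, "G"),
]

# Sample 2 is withdrawn after its rows were already loaded.
kept = rows_for_filter_training(alt_allele, withdrawn_sample_ids={2})
print([row.sample_id for row in kept])  # [1, 3]
```

If the filtering step reads `alt_allele` without a check like this, rows for a sample withdrawn after loading would still feed the model, which is the scenario the comment asks about.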

1. `GvsCreateFilterSet` workflow
- This step calculates features from the `alt_allele` table, and trains the VETS filtering model along with site-level QC filters and loads them into BigQuery into a series of `filter_set_*` tables.
- See [naming conventions doc](https://docs.google.com/document/d/1pNtuv7uDoiOFPbwe4zx5sAGH7MyxwKqXkyrpNmBxeow) for guidance on what to use for `filter_set_name`, which you will need to keep track of for the `GvsExtractAvroFilesForHail` WDL. If, for some reason, this step needs to be run multiple times, be sure to use a different `filter_set_name` (the doc has guidance for this, as well).
2 changes: 1 addition & 1 deletion scripts/variantstore/docs/aou/vds/Creating a VDS.md
@@ -26,7 +26,7 @@ NOTE: in the WDL created to run this script, the temp directory can be easily de

## Validate the VDS to ensure that it is ready to be shared

-Copy the [VDS Validation python script](vds_validation.py) to the notebook environment.
+Copy the [VDS Validation python script](../../../wdl/extract/vds_validation.py) to the notebook environment.
Run it with the following arguments:

`--vds-path`: the GCS path to the newly-created VDS
64 changes: 0 additions & 64 deletions scripts/variantstore/docs/aou/vds/vds_validation.py

This file was deleted.
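
With the copy under `docs/aou/vds` deleted, the link in `Creating a VDS.md` has to resolve somewhere else. A quick sanity check of both the old and the new link, assuming relative links are resolved against the directory containing the Markdown file (`resolve_relative_link` is a hypothetical helper written for this check):

```python
import posixpath


def resolve_relative_link(markdown_file, link):
    """Resolve a relative link against the directory of the Markdown file."""
    base_dir = posixpath.dirname(markdown_file)
    return posixpath.normpath(posixpath.join(base_dir, link))


doc = "scripts/variantstore/docs/aou/vds/Creating a VDS.md"

old_target = resolve_relative_link(doc, "vds_validation.py")
new_target = resolve_relative_link(doc, "../../../wdl/extract/vds_validation.py")

print(old_target)  # scripts/variantstore/docs/aou/vds/vds_validation.py
print(new_target)  # scripts/variantstore/wdl/extract/vds_validation.py
```

The old target matches the file this PR deletes, and the new target sits under `scripts/variantstore/wdl/extract/`. This only checks the path arithmetic; it does not confirm that a file exists at the new location.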