Move documentation to website (#737)
mwalker174 authored Oct 25, 2024
1 parent ae9cd84 commit eb2e5b0
Showing 66 changed files with 2,957 additions and 1,317 deletions.
626 changes: 6 additions & 620 deletions README.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion wdl/GenotypeBatch.wdl
@@ -28,7 +28,7 @@ workflow GenotypeBatch {
File? pesr_exclude_list # Required unless skipping training
File splitfile
File? splitfile_index
- String? reference_build #hg19 or hg38, Required unless skipping training
+ String? reference_build # Must be hg38, Required unless skipping training
File bin_exclude
File ref_dict
# If all specified, training will be skipped (for single sample pipeline)
18 changes: 18 additions & 0 deletions website/docs/acknowledgements.md
@@ -0,0 +1,18 @@
---
title: Acknowledgements
description: Acknowledgements
sidebar_position: 10
---

The following resources were produced using data from the [All of Us Research Program](https://allofus.nih.gov/)
and have been approved by the Program for public dissemination:

* Genotype filtering model: "aou_recalibrate_gq_model_file" in "inputs/values/resources_hg38.json"

The All of Us Research Program is supported by the National Institutes of Health, Office of the Director: Regional
Medical Centers: 1 OT2 OD026549; 1 OT2 OD026554; 1 OT2 OD026557; 1 OT2 OD026556; 1 OT2 OD026550; 1 OT2 OD 026552; 1
OT2 OD026553; 1 OT2 OD026548; 1 OT2 OD026551; 1 OT2 OD026555; IAA #: AOD 16037; Federally Qualified Health Centers:
HHSN 263201600085U; Data and Research Center: 5 U2C OD023196; Biobank: 1 U24 OD023121; The Participant Center: U24
OD023176; Participant Technology Systems Center: 1 U24 OD023163; Communications and Engagement: 3 OT2 OD023205; 3 OT2
OD023206; and Community Partners: 1 OT2 OD025277; 3 OT2 OD025315; 1 OT2 OD025337; 1 OT2 OD025276. In addition, the All
of Us Research Program would not be possible without the partnership of its participants.
2 changes: 1 addition & 1 deletion website/docs/advanced/_category_.json
@@ -1,6 +1,6 @@
{
"label": "Advanced Guides",
- "position": 8,
+ "position": 9,
"link": {
"type": "generated-index"
}
70 changes: 1 addition & 69 deletions website/docs/advanced/build_inputs.md
@@ -1,7 +1,7 @@
---
title: Building inputs
description: Building work input json files
- sidebar_position: 1
+ sidebar_position: 3
slug: build_inputs
---

@@ -43,8 +43,6 @@ You may run the following commands to get these example inputs.
└── test
```

## Building inputs for specific use-cases (Advanced)

### Build for batched workflows

```shell
@@ -55,69 +53,3 @@ python scripts/inputs/build_inputs.py \
-a '{ "test_batch" : "ref_panel_1kg" }'
```


### Generating a reference panel

This section only applies to the single-sample mode.
New reference panels can be generated from a single run of the
`GATKSVPipelineBatch` workflow.
If using a Cromwell server, we recommend copying the outputs to a
permanent location by adding the following option to the
[workflow configuration](https://cromwell.readthedocs.io/en/latest/wf_options/Overview/)
file:

```json
"final_workflow_outputs_dir" : "gs://my-outputs-bucket",
"use_relative_output_paths": false,
```

Here is an example of how to generate workflow input jsons from `GATKSVPipelineBatch` workflow metadata:

1. Get metadata from Cromshell.

```shell
cromshell -t60 metadata 38c65ca4-2a07-4805-86b6-214696075fef > metadata.json
```

2. Run the script.

```shell
python scripts/inputs/create_test_batch.py \
--execution-bucket gs://my-exec-bucket \
--final-workflow-outputs-dir gs://my-outputs-bucket \
metadata.json \
> inputs/values/my_ref_panel.json
```

3. Build test files for batched workflows (Google Cloud project ID required).

```shell
python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/test \
inputs/build/my_ref_panel/test \
-a '{ "test_batch" : "ref_panel_1kg" }'
```

4. Build test files for the single-sample workflow

```shell
python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/test/GATKSVPipelineSingleSample \
inputs/build/NA19240/test_my_ref_panel \
-a '{ "single_sample" : "test_single_sample_NA19240", "ref_panel" : "my_ref_panel" }'
```

5. Build files for a Terra workspace.

```shell
python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/terra_workspaces/single_sample \
inputs/build/NA12878/terra_my_ref_panel \
-a '{ "single_sample" : "test_single_sample_NA12878", "ref_panel" : "my_ref_panel" }'
```

Note that the inputs to `GATKSVPipelineBatch` may be used as resources
for the reference panel and therefore should also be in a permanent location.
73 changes: 73 additions & 0 deletions website/docs/advanced/build_ref_panel.md
@@ -0,0 +1,73 @@
---
title: Building reference panels
description: Building reference panels for the single-sample pipeline
sidebar_position: 4
slug: build_ref_panel
---

A custom reference panel for the [single-sample mode](/docs/gs/calling_modes#single-sample-mode) can be generated most easily using the
[GATKSVPipelineBatch](https://github.com/broadinstitute/gatk-sv/blob/main/wdl/GATKSVPipelineBatch.wdl) workflow.
This must be run on a standalone Cromwell server, as the workflow is unstable on Terra.

:::note
Reference panels can also be generated by running the pipeline through joint calling on Terra, but there is
currently no solution for automatically updating inputs.
:::

We recommend copying the outputs from a Cromwell run to a permanent location by adding the following option to
the workflow configuration file:
```
"final_workflow_outputs_dir" : "gs://my-outputs-bucket",
"use_relative_output_paths": false,
```
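
The two options above are shown as a fragment. A complete options file might look like the following minimal sketch, which writes the file and checks that it parses as JSON. The file name `workflow_options.json` is an assumption chosen for illustration, and `gs://my-outputs-bucket` is the same placeholder bucket used above; how the file is passed to Cromwell depends on how you submit the workflow.

```shell
# Write a complete workflow options file.
# "workflow_options.json" is a hypothetical name; the bucket is a placeholder.
cat > workflow_options.json <<'EOF'
{
  "final_workflow_outputs_dir": "gs://my-outputs-bucket",
  "use_relative_output_paths": false
}
EOF

# Confirm the file parses as JSON before submitting it with the workflow.
python3 -m json.tool workflow_options.json > /dev/null \
  && echo "workflow_options.json is valid JSON"
```

Note that a standalone options file must be a single JSON object, so the trailing comma from the fragment is dropped here.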

Here is an example of how to generate workflow input jsons from `GATKSVPipelineBatch` workflow metadata:

1. Get metadata from Cromshell.

```shell
cromshell -t60 metadata 38c65ca4-2a07-4805-86b6-214696075fef > metadata.json
```

2. Run the script.

```shell
python scripts/inputs/create_test_batch.py \
--execution-bucket gs://my-exec-bucket \
--final-workflow-outputs-dir gs://my-outputs-bucket \
metadata.json \
> inputs/values/my_ref_panel.json
```

3. Build test files for batched workflows (Google Cloud project ID required).

```shell
python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/test \
inputs/build/my_ref_panel/test \
-a '{ "test_batch" : "ref_panel_1kg" }'
```

4. Build test files for the single-sample workflow

```shell
python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/test/GATKSVPipelineSingleSample \
inputs/build/NA19240/test_my_ref_panel \
-a '{ "single_sample" : "test_single_sample_NA19240", "ref_panel" : "my_ref_panel" }'
```

5. Build files for a Terra workspace.

```shell
python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/terra_workspaces/single_sample \
inputs/build/NA12878/terra_my_ref_panel \
-a '{ "single_sample" : "test_single_sample_NA12878", "ref_panel" : "my_ref_panel" }'
```

Note that the inputs to `GATKSVPipelineBatch` may be used as resources
for the reference panel and therefore should also be in a permanent location.
@@ -1,6 +1,6 @@
{
- "label": "Development",
- "position": 6,
+ "label": "Cromwell",
+ "position": 1,
"link": {
"type": "generated-index"
}
@@ -1,6 +1,6 @@
---
- title: Cromwell
- description: Running GATK-SV on Cromwell
+ title: Overview
+ description: Introduction to Cromwell
sidebar_position: 0
---

@@ -1,18 +1,16 @@
---
- title: Quick Start
- description: Run the pipeline on demo data.
+ title: Run
+ description: Running GATK-SV on Cromwell
sidebar_position: 1
slug: ./qs
---

This page provides steps for running the pipeline using demo data.

# Quick Start on Cromwell

This section walks you through the steps of running pipeline using
demo data on a managed Cromwell server.

- ### Setup Environment
+ ### Environment Setup

- A running instance of a Cromwell server.

4 changes: 2 additions & 2 deletions website/docs/advanced/docker/_category_.json
@@ -1,6 +1,6 @@
{
- "label": "Docker Images",
- "position": 7,
+ "label": "Docker builds",
+ "position": 2,
"link": {
"type": "generated-index"
}
18 changes: 18 additions & 0 deletions website/docs/best_practices.md
@@ -0,0 +1,18 @@
---
title: Best Practices Guide
description: Guide for using GATK-SV
sidebar_position: 4
---

A comprehensive guide for the single-sample calling mode is available in [GATK Best Practices for Structural Variation
Discovery on Single Samples](https://gatk.broadinstitute.org/hc/en-us/articles/9022653744283-GATK-Best-Practices-for-Structural-Variation-Discovery-on-Single-Samples).
This material covers basic concepts of structural variant calling, specifics of SV VCF formatting, and
advanced troubleshooting, all of which also apply to the joint calling mode. This guide is intended to supplement
the documentation found here.

Users should also review the [Getting Started](/docs/gs/overview) section before attempting to perform SV calling.

The following sections also contain recommendations pertaining to data and call set QC:

- Preliminary sample QC in the [EvidenceQc module](/docs/modules/eqc#preliminary-sample-qc).
- Assessment of completed call sets can be found on the [MainVcfQc module page](/docs/modules/mvqc).
@@ -1,5 +1,5 @@
{
- "label": "Run",
+ "label": "Execution",
"position": 4,
"link": {
"type": "generated-index"