Merge pull request #43 from AlexandrovLab/development
Development
mdbarnesUCSD authored Oct 5, 2022
2 parents 3c258eb + 3eebbd0 commit 3095567
Showing 36 changed files with 16,030 additions and 13,244 deletions.
5 changes: 2 additions & 3 deletions MANIFEST.in
@@ -6,6 +6,5 @@ include SigProfilerAssignment/data/Reference_Signatures/GRCh38/*
include SigProfilerAssignment/data/Reference_Signatures/mm9/*
include SigProfilerAssignment/data/Reference_Signatures/mm10/*
include SigProfilerAssignment/data/Reference_Signatures/rn6/*
include SigProfilerAssignment/src/FormatFiles/*
include SigProfilerAssignment/src/Fonts/*
include SigProfilerAssignment/src/*
include SigProfilerAssignment/DecompositionPlots/reference_files/Fonts/*
include SigProfilerAssignment/DecompositionPlots/reference_files/*
53 changes: 44 additions & 9 deletions README.md
@@ -3,7 +3,7 @@



<img src="SigProfilerAssignment/src/figures/SigProfilerAssignment.png" alt="drawing" width="1000"/>
<img src="SigProfilerAssignment/figures/SigProfilerAssignment.png" alt="drawing" width="1000"/>

# SigProfilerAssignment
SigProfilerAssignment is a new mutational attribution and decomposition tool that performs the following functions:
@@ -72,7 +72,7 @@ spa_analyze( samples, output, signatures=None, signature_database=None,decompo
``` -->
### Decompose Fit
Decomposes the *de novo* signatures into COSMIC signatures and assigns the COSMIC signatures to samples.
<img src="SigProfilerAssignment/src/figures/decomp_pic.jpg" alt="drawing" width="600"/>
<img src="SigProfilerAssignment/figures/decomp_pic.jpg" alt="drawing" width="600"/>

```python
from SigProfilerAssignment import Analyzer as Analyze
@@ -86,9 +86,12 @@ Analyze.decompose_fit(samples,
exclude_signature_subgroups=exclude_signature_subgroups,
exome=False)
```

## Analysis

### *De Novo* Fit
Attributes the mutations of the given samples to the input *de novo* signatures.
<img src="SigProfilerAssignment/src/figures/denovo_fit.jpg" alt="drawing" width="600"/>
<img src="SigProfilerAssignment/figures/denovo_fit.jpg" alt="drawing" width="600"/>

```python
from SigProfilerAssignment import Analyzer as Analyze
@@ -102,7 +105,7 @@ Analyze.denovo_fit( samples,
### COSMIC Fit
Attributes the mutations of the given samples to the input COSMIC signatures. Note that the penalties associated with the *de novo* fit and the COSMIC fit are different.

<img src="SigProfilerAssignment/src/figures/cosmic_fit.jpg" alt="drawing" width="600"/>
<img src="SigProfilerAssignment/figures/cosmic_fit.jpg" alt="drawing" width="600"/>

```python
from SigProfilerAssignment import Analyzer as Analyze
@@ -121,23 +124,25 @@ Analyze.cosmic_fit( samples,
## Main Parameters
| Parameter | Variable Type | Parameter Description |
| --------------------- | -------- |-------- |
| **samples** | String | Path to a tab-delimited file that contains the samples table, where the rows are mutation types and the columns are sample IDs, or path to a directory of VCF files if the input files are VCFs. |
| **samples** | String | Path to input file for `input_type`:<ul><li>"matrix"</li><li>"seg:TYPE"</li></ul> Path to input folder for `input_type`:<ul><li>"vcf"</li></ul>|
| **output** | String | Path to the output folder. |
| **input_type** | String | The type of input:<br><ul><li>"vcf": used for vcf format inputs.</li><li>"matrix": used for table format inputs using a tab-separated file.</li></ul> Default value is "matrix"|
| **input_type** | String | The type of input:<br><ul><li>"matrix": used for table format inputs using a tab-separated file where the rows are mutation types and the columns are sample IDs.</li><li>"vcf": used for mutation calling file inputs (VCFs, MAFs or simple text files).</li><li>"seg:TYPE": used for a multi-sample segmentation file for copy number analysis. The accepted callers for TYPE are the following {"ASCAT", "ASCAT_NGS", "SEQUENZA", "ABSOLUTE", "BATTENBERG", "FACETS", "PURPLE", "TCGA"}. For example, when using segmentation file from BATTENBERG then set input_type to "seg:BATTENBERG".</li></ul> The default value is "matrix".|
| **context_type**| String| Required context type if `input_type` is "vcf". `context_type` takes which context type of the input data is considered for assignment. Valid options include "96", "288", "1536", "DINUC", and "INDEL". The default value is "96".|
| **signatures** | String | Path to a tab-delimited file that contains the signature table, where the rows are mutation types and the columns are signature IDs. |
| **genome_build** | String | The reference genome build. List of supported genomes: "GRCh37", "GRCh38", "mm9", "mm10" and "rn6". The default value is "GRCh37". If the selected genome is not in the supported list, the default genome will be used. |
| **cosmic_version** | Float | Takes a positive float among 1, 2, 3, 3.1, 3.2 and 3.3. Defines the version of the COSMIC reference signatures. The default value is 3.3. |
| **new_signature_thresh_hold**| Float | Cosine similarity threshold used to declare a new signature. Applicable for decompose_fit only. The default value is 0.8. |
| **make_plots** | Boolean | Toggle on and off for making and saving all plots. Default value is True. |
| **exclude_signature_subgroups** | List | Removes the signatures corresponding to specific subtypes for better fitting. The usage is given above. Default value is None. |
| **exome** | Boolean | Defines if the exome renormalized signatures will be used. The default value is False. |
| **context_type**| String| Required context type if "input_type" is "vcf". 'context_type' specifies which context type of the mutation matrix is considered for assignment. Valid options include '96', '6', '24', '4608', '288', '18','6144', '384', '1536', 'DINUC'. Default value is '96'.|
| **verbose** | Boolean | Prints statements. Default value is False. |



#### SPA analysis Example for a matrix
## Examples

### SPA analysis - Example for a matrix


```python
@@ -167,7 +172,7 @@ Analyze.cosmic_fit( samples,

```
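
For readers preparing their own `matrix` input, the following is a minimal sketch of the layout described in the parameter table: a tab-separated file whose rows are mutation types and whose columns are sample IDs. The header label `MutationType`, the sample names, the counts, the file name, and the helpers `write_matrix` and `read_matrix` are illustrative assumptions and are not part of SigProfilerAssignment; a real 96-context matrix would contain 96 rows.

```python
import csv
import os
import tempfile


def write_matrix(path, mutation_types, counts_by_sample):
    """Write a tab-separated matrix: one row per mutation type, one column per sample.

    Hypothetical helper for illustration only.
    """
    sample_ids = list(counts_by_sample)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        # Header row: assumed label for the first column, then the sample IDs.
        writer.writerow(["MutationType"] + sample_ids)
        for row_index, mutation_type in enumerate(mutation_types):
            row_counts = [counts_by_sample[s][row_index] for s in sample_ids]
            writer.writerow([mutation_type] + row_counts)


def read_matrix(path):
    """Read the file back and return (header, data_rows) as lists of strings."""
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    return rows[0], rows[1:]


# Toy data: three mutation types and two samples (made-up counts).
mutation_types = ["A[C>A]A", "A[C>A]C", "A[C>A]G"]
counts = {"Sample_1": [12, 3, 0], "Sample_2": [7, 9, 4]}

matrix_path = os.path.join(tempfile.mkdtemp(), "toy_samples.txt")
write_matrix(matrix_path, mutation_types, counts)
header, body = read_matrix(matrix_path)
```

A file written this way could then be passed as `samples` with `input_type="matrix"`, assuming the row labels match the context type of the chosen signatures.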

#### SPA analysis Example for input vcf files
### SPA analysis - Example for input vcf files


```python
@@ -198,6 +203,36 @@ Analyze.cosmic_fit( samples,
exome=False)

```

### SPA analysis - Example for an input multi-sample segmentation file


```python
#import modules
import SigProfilerAssignment as spa
from SigProfilerAssignment import Analyzer as Analyze

#set directories and paths to signatures and samples
dir_inp = spa.__path__[0]+'/data/Examples/'
samples = spa.__path__[0]+'/data/cnvtest/all.breast.ascat.summary.sample.tsv' # segmentation file
output = "output_example/"

#Analysis of SP Assignment
Analyze.cosmic_fit( samples,
output,
input_type="seg:ASCAT_NGS",
context_type="CNV48",
signatures=None,
signature_database=None,
genome_build="GRCh37",
cosmic_version=3.3,
verbose=False,
collapse_to_SBS96=False,
make_plots=True,
exclude_signature_subgroups=None,
exome=False)
```
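
Because the `"seg:TYPE"` form packs two pieces of information into one string, it can be useful to check the value before starting a long run. The sketch below shows one way to do that, using the caller names listed for `input_type` in the parameter table. The function `parse_input_type` and the constant `SUPPORTED_CALLERS` are hypothetical names introduced here for illustration; this is not how the package itself validates its arguments.

```python
# Caller names copied from the input_type row of the parameter table.
SUPPORTED_CALLERS = {
    "ASCAT", "ASCAT_NGS", "SEQUENZA", "ABSOLUTE",
    "BATTENBERG", "FACETS", "PURPLE", "TCGA",
}


def parse_input_type(input_type="matrix"):
    """Split an input_type string into (kind, caller).

    Hypothetical helper: caller is None unless kind is "seg".
    Raises ValueError for anything outside the documented options.
    """
    if input_type in ("matrix", "vcf"):
        return input_type, None
    kind, separator, caller = input_type.partition(":")
    if kind == "seg" and separator and caller in SUPPORTED_CALLERS:
        return kind, caller
    raise ValueError(
        f"Unsupported input_type {input_type!r}; expected 'matrix', 'vcf', "
        f"or 'seg:TYPE' with TYPE in {sorted(SUPPORTED_CALLERS)}"
    )


kind, caller = parse_input_type("seg:ASCAT_NGS")
```

Failing early with a `ValueError` is a design choice of this sketch; it keeps a mistyped caller name from surfacing only after the segmentation file has been read.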

## <a name="copyright"></a> Copyright
This software and its documentation are copyright 2022 as a part of the SigProfiler project. The SigProfilerAssignment framework is free software and is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
