Update SASC-viz tool to show cell nodes and labels (fix Issue #1). Added autowrap function for the labels
Simone Ciccolella committed Sep 21, 2021
1 parent 31563a5 commit 1ec9146
Showing 2 changed files with 154 additions and 55 deletions.
78 changes: 75 additions & 3 deletions README.md
@@ -142,9 +142,26 @@ Usage
----------

```bash
python3 SASC-viz.py [-h] -t TREE [-E CELLNAMES | -n TOTCELL] [--show-support]
[--show-color] [--collapse-support COLLAPSE_SUPPORT]
[--collapse-simple] [--sep SEP]
usage: SASC-viz.py [-h] -t TREE [-E CELLNAMES | -n TOTCELL] [--show-support] [--show-color] [--show-cell-labels] [--collapse-support COLLAPSE_SUPPORT] [--collapse-simple] [--sep SEP] [--wrap-width WRAP_WIDTH]

SASC visualitation tool

optional arguments:
-h, --help show this help message and exit
-t TREE, --tree TREE path of the input file.
-E CELLNAMES, --cellnames CELLNAMES
path to the cell labels file
-n TOTCELL, --totcell TOTCELL
total number of cells
--show-support Show the support for each node.
--show-color Enable coloring of nodes.
--show-cell-labels Show cells nodes and their labels.
--collapse-support COLLAPSE_SUPPORT
Collapse path with lower support
--collapse-simple Collapse simple paths
--sep SEP Labels' separator
--wrap-width WRAP_WIDTH
Max width to wrap labels. Set to 0 to no wrap.
```

**Input files required**
@@ -163,6 +180,8 @@ You are required to use either:
- `--collapse-support THRESHOLD`: if the support of a node _x_ is lower than the specified `THRESHOLD`, then _x_ is merged with its parent. This operation is performed in a bottom-up fashion starting from the leaves.
- `--show-color`: if this flag is used, the nodes will be colored using a gradient scale from red to green based on the support.
- `--sep STRING`: if this option is used, the labels on the nodes will be separated by `STRING`. Default is `, ` (a comma followed by a space).
- `--show-cell-labels`: if this flag is used, the program will show cell nodes and their labels.
- `--wrap-width INT`: maximum line width, in characters, used when wrapping labels. Set to 0 to disable wrapping. Default is 0.
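
A minimal sketch of how label wrapping behaves, assuming (as in the `SASC-viz.py` changes of this commit) that labels are wrapped with Python's `textwrap.wrap` and the pieces are joined with a literal `\n` escape that Graphviz renders as a line break. The helper name `wrap_label` is hypothetical and is not part of the tool:

```python
import textwrap

def wrap_label(label, wrap_width=0):
    """Wrap a label for DOT output; a width of 0 leaves the label untouched."""
    if wrap_width > 0:
        # '\\n' is a backslash followed by 'n' (not a real newline):
        # Graphviz interprets this escape as a line break inside a label.
        pieces = textwrap.wrap(label, width=wrap_width, break_long_words=False)
        return '\\n'.join(pieces)
    return label

label = "IDH1, NOTCH2, RTTN, TBC1D10A, MLYCD, CACNA1G, CTNNA2"
print(wrap_label(label, wrap_width=40))
# prints: IDH1, NOTCH2, RTTN, TBC1D10A, MLYCD,\nCACNA1G, CTNNA2
```

Because `break_long_words=False` is used, a single label longer than the width is kept whole rather than split in the middle.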

**Output**

@@ -216,6 +235,59 @@ Output:

![examples/tree.png](examples/tree.png)

## Show cell labels and wrap text
```bash
python3 SASC-viz.py -t examples/MGH36_scs_mlt.gv -E data/real/MGH36/MGH36_cell-names.txt --collapse-simple --collapse-support 20 --show-support --show-cell-labels --wrap-width 40
```
Output:

```
digraph phylogeny {
node [penwidth=2];
"0" [label="germline [255 cells]"];
"0" -> "1";
"1" [label="IDH1, NOTCH2, RTTN, TBC1D10A, MLYCD,\nCACNA1G, CTNNA2, NRN1, APC2, IL33, RFX3,\nUBE2Z, SH3BP5, PIK3CA, NR3C1,\nRP11-356C4.3, VPS9D1, PLEKHM1,\nLINC00937, ST8SIA3, CPEB4, TRPM3,\nTRIOBP, CEP55, NBPF10, ZZEF1, CCDC181,\nPHLDB3, MCM8, ARHGEF3, AGAP2, NR5A2,\nZNF451, TFAP2A, KHSRP, VGLL4, ZNF721,\nKIF2A, USP36, IFT81, SVEP1 [s=100%]"];
"1" -> "1-cells";
"1-cells" [label="MGH36-P03-A10, MGH36-P03-A12,\nMGH36-P03-B03, MGH36-P03-B12,\nMGH36-P03-C06, MGH36-P03-C11,\nMGH36-P03-D04, MGH36-P03-D11,\nMGH36-P03-E06, MGH36-P03-E09,\nMGH36-P03-E11, MGH36-P03-E12,\nMGH36-P03-F05, MGH36-P03-F07,\nMGH36-P03-F11, MGH36-P03-G03,\nMGH36-P03-G11, MGH36-P03-G12,\nMGH36-P03-H05, MGH36-P03-H11,\nMGH36-P04-A03, MGH36-P04-A07,\nMGH36-P04-B04, MGH36-P04-B09,\nMGH36-P04-C02, MGH36-P04-C08,\nMGH36-P04-C10, MGH36-P04-C11,\nMGH36-P04-E11, MGH36-P04-F05,\nMGH36-P04-F06, MGH36-P04-F10,\nMGH36-P04-G02, MGH36-P04-G04,\nMGH36-P04-G05, MGH36-P04-H09,\nMGH36-P06-A07, MGH36-P06-A08,\nMGH36-P06-A12, MGH36-P06-B05,\nMGH36-P06-B06, MGH36-P06-C02,\nMGH36-P06-C03, MGH36-P06-D01,\nMGH36-P06-D02, MGH36-P06-D03,\nMGH36-P06-D08, MGH36-P06-E04,\nMGH36-P06-E06, MGH36-P06-E07,\nMGH36-P06-E09, MGH36-P06-E10,\nMGH36-P06-E11, MGH36-P06-F01,\nMGH36-P06-F06, MGH36-P06-F12,\nMGH36-P06-G01, MGH36-P06-G04,\nMGH36-P06-G05, MGH36-P06-H06,\nMGH36-P06-H07, MGH36-P07-A04,\nMGH36-P07-B03, MGH36-P07-B06,\nMGH36-P07-D02, MGH36-P07-D03,\nMGH36-P07-D07, MGH36-P07-D09,\nMGH36-P07-E04, MGH36-P07-E07,\nMGH36-P07-E10, MGH36-P07-F02,\nMGH36-P07-F09, MGH36-P07-F11,\nMGH36-P07-F12, MGH36-P07-G06,\nMGH36-P07-H08, MGH36-P07-H09,\nMGH36-P08-A01, MGH36-P08-A06,\nMGH36-P08-A09, MGH36-P08-B01,\nMGH36-P08-B04, MGH36-P08-B05,\nMGH36-P08-C02, MGH36-P08-C10,\nMGH36-P08-D01, MGH36-P08-D03,\nMGH36-P08-D04, MGH36-P08-D05,\nMGH36-P08-D10, MGH36-P08-D11,\nMGH36-P08-E01, MGH36-P08-E03,\nMGH36-P08-E10, MGH36-P08-F02,\nMGH36-P08-G01, MGH36-P08-G04,\nMGH36-P08-G07, MGH36-P08-G09,\nMGH36-P08-G10, MGH36-P08-H01,\nMGH36-P08-H06, MGH36-P08-H07,\nMGH36-P08-H11, MGH36-P09-A10,\nMGH36-P09-B07, MGH36-P09-B12,\nMGH36-P09-C09, MGH36-P09-C11,\nMGH36-P09-E02, MGH36-P09-E03,\nMGH36-P09-F03, MGH36-P09-F08,\nMGH36-P09-F09, MGH36-P09-F11,\nMGH36-P09-F12, MGH36-P09-G03,\nMGH36-P09-G04, MGH36-P09-G05,\nMGH36-P09-G08, MGH36-P09-G11,\nMGH36-P09-H02, MGH36-P09-H03,\nMGH36-P09-H04, MGH36-P09-H08,\nMGH36-P09-H10, 
MGH36-P09-H12,\nMGH36-P10-A02, MGH36-P10-A04,\nMGH36-P10-A05, MGH36-P10-A07,\nMGH36-P10-B01, MGH36-P10-B03,\nMGH36-P10-B05, MGH36-P10-B08,\nMGH36-P10-C06, MGH36-P10-C12,\nMGH36-P10-D04, MGH36-P10-D05,\nMGH36-P10-E03, MGH36-P10-E12,\nMGH36-P10-F02, MGH36-P10-G02,\nMGH36-P10-G03, MGH36-P10-G08,\nMGH36-P10-G09, MGH36-P10-G10,\nMGH36-P10-H11, MGH36-P03-F06,\nMGH36-P04-E01, MGH36-P07-H01,\nMGH36-P08-B09, MGH36-P09-E04,\nMGH36-P06-C06, MGH36-P06-G12,\nMGH36-P06-E05, MGH36-P06-G07,\nMGH36-P03-C08, MGH36-P07-A03,\nMGH36-P08-B03, MGH36-P10-G06,\nMGH36-P06-E03, MGH36-P09-E11,\nMGH36-P04-D07, MGH36-P04-F02,\nMGH36-P04-F03, MGH36-P10-C10,\nMGH36-P10-G05, MGH36-P07-A11,\nMGH36-P07-B09, MGH36-P04-E07,\nMGH36-P09-F06, MGH36-P06-E08,\nMGH36-P08-E05, MGH36-P09-E01,\nMGH36-P10-E11, MGH36-P07-B08", shape=rect];
"1" -> "23";
"23" [label="CEBPZ, DGCR6L, MAN1B1, ENO3, ZNF526,\nMIR4477B, KMT2C, SLC26A11, ORC3, KAT6A,\nCNNM2, CLEC18B, SLC16A7 [s=76%]"];
"23" -> "23-cells";
"23-cells" [label="MGH36-P08-D07, MGH36-P10-D10,\nMGH36-P10-G07, MGH36-P04-E10,\nMGH36-P03-C05, MGH36-P03-E05,\nMGH36-P03-E07, MGH36-P03-E08,\nMGH36-P03-E10, MGH36-P03-F02,\nMGH36-P03-G01, MGH36-P03-G02,\nMGH36-P03-G04, MGH36-P04-A08,\nMGH36-P04-B05, MGH36-P04-B06,\nMGH36-P04-C09, MGH36-P04-E06,\nMGH36-P04-E09, MGH36-P04-G06,\nMGH36-P04-G11, MGH36-P04-H04,\nMGH36-P04-H07, MGH36-P06-A10,\nMGH36-P06-A11, MGH36-P06-B04,\nMGH36-P06-C08, MGH36-P06-D10,\nMGH36-P06-E12, MGH36-P06-F07,\nMGH36-P06-G03, MGH36-P06-G06,\nMGH36-P06-G10, MGH36-P06-H10,\nMGH36-P06-H12, MGH36-P07-B02,\nMGH36-P07-B04, MGH36-P07-B07,\nMGH36-P07-B11, MGH36-P07-D01,\nMGH36-P07-E01, MGH36-P07-E05,\nMGH36-P07-F03, MGH36-P07-F04,\nMGH36-P07-G01, MGH36-P07-H02,\nMGH36-P07-H03, MGH36-P07-H11,\nMGH36-P08-A08, MGH36-P08-B02,\nMGH36-P08-C01, MGH36-P08-C08,\nMGH36-P08-D08, MGH36-P08-G06,\nMGH36-P09-A07, MGH36-P09-E05,\nMGH36-P09-E06, MGH36-P09-E07,\nMGH36-P09-G06, MGH36-P09-H11,\nMGH36-P10-B12, MGH36-P10-C01,\nMGH36-P10-C09, MGH36-P10-D01,\nMGH36-P10-F08, MGH36-P10-F10,\nMGH36-P10-F12, MGH36-P10-G01,\nMGH36-P03-B07, MGH36-P03-B08,\nMGH36-P06-A02, MGH36-P07-E06,\nMGH36-P09-A11, MGH36-P09-B08,\nMGH36-P04-F12, MGH36-P06-D07,\nMGH36-P06-B10, MGH36-P08-G05,\nMGH36-P10-E02", shape=rect];
"23" -> "27";
"27" [label="PCDHA1 [s=28%]"];
"27" -> "27-cells";
"27-cells" [label="MGH36-P10-F11", shape=rect];
"27" -> "28";
"28" [label="HELZ2, RIN2 [s=50%]"];
"28" -> "28-cells";
"28-cells" [label="MGH36-P04-B08, MGH36-P04-C07,\nMGH36-P08-C05, MGH36-P10-D03", shape=rect];
"27" -> "30";
"30" [label="TXNDC2, HEATR4 [s=25%]"];
"30" -> "30-cells";
"30-cells" [label="MGH36-P03-A05, MGH36-P04-D01", shape=rect];
"27" -> "32";
"32" [label="NPEPL1 [s=25%]"];
"32" -> "32-cells";
"32-cells" [label="MGH36-P03-D07, MGH36-P06-F10", shape=rect];
"23" -> "33";
"33" [label="EEF1B2, ZNF462, EP400, RP11-403I13.8\n[s=71%]"];
"33" -> "33-cells";
"33-cells" [label="MGH36-P03-A09, MGH36-P03-F04,\nMGH36-P04-A01, MGH36-P04-F01,\nMGH36-P06-B02, MGH36-P06-B11,\nMGH36-P06-C04, MGH36-P06-D12,\nMGH36-P07-A07, MGH36-P07-A09,\nMGH36-P07-A12, MGH36-P07-C11,\nMGH36-P07-D10, MGH36-P07-F06,\nMGH36-P07-G11, MGH36-P07-H05,\nMGH36-P08-D12, MGH36-P08-E02,\nMGH36-P08-E08, MGH36-P09-A12,\nMGH36-P09-B11, MGH36-P10-B07,\nMGH36-P10-E07", shape=rect];
"1" -> "46";
"46" [label="HLA-DQB2, ABCA7, STXBP1, RUNX2, SOX5,\nKIAA0907, CPAMD8 [s=23%]"];
"46" -> "46-cells";
"46-cells" [label="MGH36-P03-B06, MGH36-P03-E01,\nMGH36-P03-H06, MGH36-P04-A10,\nMGH36-P04-B03, MGH36-P04-B10,\nMGH36-P04-D06, MGH36-P04-G12,\nMGH36-P04-H06, MGH36-P04-H08,\nMGH36-P04-H11, MGH36-P06-B07,\nMGH36-P06-E01, MGH36-P06-F09,\nMGH36-P06-F11, MGH36-P06-H05,\nMGH36-P06-H11, MGH36-P07-F07,\nMGH36-P07-H06, MGH36-P08-F01,\nMGH36-P08-F08, MGH36-P08-F11,\nMGH36-P09-F05, MGH36-P10-E06,\nMGH36-P10-E09, MGH36-P10-F01,\nMGH36-P10-H08", shape=rect];
"46" -> "47";
"47" [label="ANKRD30B, FAM182B, TRPM2, AS3MT [s=25%]"];
"47" -> "47-cells";
"47-cells" [label="MGH36-P07-A02, MGH36-P08-F06", shape=rect];
"46" -> "52";
"52" [label="EMR2, CYP27A1 [s=75%]"];
"52" -> "52-cells";
"52-cells" [label="MGH36-P06-A04, MGH36-P06-C01,\nMGH36-P08-G11, MGH36-P09-D02,\nMGH36-P09-D12, MGH36-P09-E12", shape=rect];
}
```
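
The cell nodes in the output above follow a regular pattern: a mutation node `X` gets an edge to a rectangular node `X-cells` whose label lists the cells attached to it. A minimal sketch of that pattern follows; `cell_node_lines` is a hypothetical helper introduced only for illustration, and label wrapping is left out for brevity:

```python
def cell_node_lines(node_id, cell_labels, sep=', '):
    """Return the two DOT statements describing the cell node of `node_id`."""
    label = sep.join(cell_labels)
    # Edge from the mutation node to its companion cell node.
    edge = '\t"{0}" -> "{0}-cells";'.format(node_id)
    # The cell node itself, drawn as a rectangle.
    node = '\t"{0}-cells" [label="{1}", shape=rect];'.format(node_id, label)
    return [edge, node]

for line in cell_node_lines(30, ["MGH36-P03-A05", "MGH36-P04-D01"]):
    print(line)
```

With the inputs shown, the two printed statements correspond to the `"30"` and `"30-cells"` entries of the example output, apart from their position in the file.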

Definition of mutation support
-----------------------------
The support _s<sub>i</sub>_ of a mutation _i_ is computed on the _n x m_ inferred matrix _E_ as follows.
131 changes: 79 additions & 52 deletions SASC-viz.py
@@ -25,6 +25,7 @@
from colour import Color
import re
import sys
import textwrap

def load_from_sasc(filepath, cell_labels, show_support=False, score=True, show_color=True):
tree = []
@@ -54,6 +55,7 @@ def load_from_sasc(filepath, cell_labels, show_support=False, score=True, show_c
s = int(s)
x = TREE.get_node(s)
x.support += 1
x.cells_labels.append(e)
continue

s = int(s)
@@ -118,15 +120,19 @@ def __init__(self, id):
self.downstream_support = 0

self.tot_cells = 0
self.cells_labels = []
self.show_support = False
self.show_color = False

def get_name(self, sep=','):
def get_name(self, sep=', '):
if not self.deletion:
return sep.join(self.mutations)
else:
return sep.join('%s-' % x for x in self.mutations)

def get_cellslabels(self, sep=', '):
return sep.join(self.cells_labels)

def get_s(self):
# return int((self.downstream_support / (self.tot_cells - self.parent.cumulative_support))*100)
# print(self.downstream_support, self.parent.downstream_support - self.parent.support)
@@ -137,7 +143,7 @@ def get_s(self):
except:
return 0

def print_node_dot(self, sep=','):
def print_node_dot(self, sep=', ', wrap_width=40, print_cellslabels=False):
c_red = Color("#FF1919")
c_green = Color("#397D02")
c_blue = Color("#3270FC")
@@ -176,6 +182,9 @@ def print_node_dot(self, sep=','):
else:
print_label = self.get_name(sep=sep)

if wrap_width > 0:
print_label = '\\n'.join(textwrap.wrap(print_label, break_long_words=False, width=wrap_width))

if self.show_color:
color_label = ', color="{}"'.format(color)
else:
@@ -193,6 +202,17 @@ def print_node_dot(self, sep=','):
print_label,
color_label
))

if print_cellslabels:
labels = self.get_cellslabels(sep=sep)
if wrap_width > 0:
labels = '\\n'.join(textwrap.wrap(labels, break_long_words=False, width=wrap_width))
print('\t"%s" -> "%s-cells";' % (self.id, self.id))
print('\t"{0}-cells" [label="{1}", shape=rect];'.format(
self.id,
labels
))


def calc_cumalitve_sup(self):
if self.parent:
@@ -238,6 +258,7 @@ def merge_nodes(self, merged, to_merge):

for m in to_merge.mutations:
merged.mutations.append(m)
merged.cells_labels += to_merge.cells_labels
self.remove_node(to_merge)
for m in merged.mutations:
self.mut_to_node[m] = merged
@@ -320,25 +341,25 @@ def delete_subtree(tree, node):
delete_subtree(tree, child)
tree.pop_node(node)

def __print_tree(node, ds_filter=0, sep=','):
def __print_tree(node, ds_filter=0, **kwargs):
if len(node.children) == 0:
if node.cumulative_support >= ds_filter:
node.print_node_dot(sep=sep)
node.print_node_dot(**kwargs)
else:
if node.cumulative_support >= ds_filter:
node.print_node_dot(sep=sep)
node.print_node_dot(**kwargs)
for child in node.children:
__print_tree(child, ds_filter=ds_filter, sep=sep)
__print_tree(child, ds_filter=ds_filter, **kwargs)

def print_dot_tree(node, ds_filter=0, sep=',', show_support=False):
def print_dot_tree(node, ds_filter=0, show_support=False, **kwargs):
print('digraph phylogeny {')
print('\tnode [penwidth=2];')
if show_support:
support = ' [{} cells]'.format(node.support)
else:
support = ''
print('\t"{0}" [label="{1}{2}"];'.format(node.id, ','.join(node.mutations), support))
__print_tree(node, ds_filter=ds_filter, sep=sep)
__print_tree(node, ds_filter=ds_filter, **kwargs)
print('}')

def calc_supports(node, level_count):
@@ -357,67 +378,73 @@ def calc_supports(node, level_count):



if __name__ == '__main__':
parser = argparse.ArgumentParser(description='SASC visualitation tool', add_help=True)

parser = argparse.ArgumentParser(description='SASC visualitation tool', add_help=True)
parser.add_argument('-t', '--tree', action='store', type=str, required=True,
help='path of the input file.')
group = parser.add_mutually_exclusive_group()
group.add_argument('-E', '--cellnames', action='store', type=str,
required=False,
help="path to the cell labels file")
group.add_argument('-n', '--totcell', action='store', type=int,
required=False,
help="total number of cells")

parser.add_argument('-t', '--tree', action='store', type=str, required=True,
help='path of the input file.')
group = parser.add_mutually_exclusive_group()
group.add_argument('-E', '--cellnames', action='store', type=str,
required=False,
help="path to the cell labels file")
group.add_argument('-n', '--totcell', action='store', type=int,
required=False,
help="total number of cells")
parser.add_argument('--show-support', action='store_true', required=False,
help="Show the support for each node.")

parser.add_argument('--show-support', action='store_true', required=False,
help="Show the support for each node.")
parser.add_argument('--show-color', action='store_true', required=False,
help="Enable coloring of nodes.")

parser.add_argument('--show-color', action='store_true', required=False,
help="Enable coloring of nodes.")
parser.add_argument('--show-cell-labels', action='store_true', required=False,
help="Show cells nodes and their labels.")

parser.add_argument('--collapse-support', action='store', type=float,
required=False,
help="Collapse path with lower support")
parser.add_argument('--collapse-support', action='store', type=float,
required=False,
help="Collapse path with lower support")

parser.add_argument('--collapse-simple', action='store_true',
required=False,
help="Collapse simple paths")
parser.add_argument('--collapse-simple', action='store_true',
required=False,
help="Collapse simple paths")

parser.add_argument('--sep', action='store', default=',', type=str,
help="Labels' separator")
parser.add_argument('--sep', action='store', default=', ', type=str,
help="Labels' separator")

args = parser.parse_args()
parser.add_argument('--wrap-width', action='store', default=0, type=int,
help="Max width to wrap labels. Set to 0 to no wrap.")

cells_labels = set()
args = parser.parse_args()

if args.cellnames:
with open(args.cellnames) as fin:
for line in fin:
cells_labels.add(line.strip())
else:
for x in range(args.totcell):
cells_labels.add('cell{}'.format(x+1))
cells_labels = set()

if args.cellnames:
with open(args.cellnames) as fin:
for line in fin:
cells_labels.add(line.strip())
else:
for x in range(args.totcell):
cells_labels.add('cell{}'.format(x+1))

x = load_from_sasc(args.tree, cells_labels, show_support=args.show_support, show_color=args.show_color, score=True)
x = load_from_sasc(args.tree, cells_labels, show_support=args.show_support, show_color=args.show_color, score=True)

from collections import defaultdict
from collections import defaultdict

lev_count = defaultdict(int)
lev_count[0] = 1
lev_count = defaultdict(int)
lev_count[0] = 1

calc_supports(x.root, lev_count)
calc_supports(x.root, lev_count)

if args.collapse_support:
collapse_low_support(x, x.root, args.collapse_support)
if args.collapse_support:
collapse_low_support(x, x.root, args.collapse_support)

if args.collapse_simple:
collapse_simple_paths(x, x.root)
if args.collapse_simple:
collapse_simple_paths(x, x.root)


lev_count = defaultdict(int)
lev_count[0] = 1
lev_count = defaultdict(int)
lev_count[0] = 1

calc_supports(x.root, lev_count)
calc_supports(x.root, lev_count)

print_dot_tree(x.root, sep=args.sep, show_support=args.show_support)
print_dot_tree(x.root, sep=args.sep, show_support=args.show_support, wrap_width=args.wrap_width, print_cellslabels=args.show_cell_labels)
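
Note that the `-E`/`-n` group above is not marked as required, so running the script with neither option reaches `range(args.totcell)` with `totcell` set to `None`. The sketch below shows one possible way to let `argparse` reject that case up front. It is an illustration under that assumption, not the author's code: `build_parser` is a hypothetical helper, and only the three relevant options are reproduced.

```python
import argparse

def build_parser():
    """Build a reduced parser in which exactly one of -E / -n must be given."""
    parser = argparse.ArgumentParser(description='argument-group sketch')
    parser.add_argument('-t', '--tree', type=str, required=True,
                        help='path of the input file.')
    # required=True makes argparse exit with an error if neither option is set.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('-E', '--cellnames', type=str,
                       help='path to the cell labels file')
    group.add_argument('-n', '--totcell', type=int,
                       help='total number of cells')
    return parser

args = build_parser().parse_args(['-t', 'tree.gv', '-n', '3'])
print(args.totcell)
```

With this setup, omitting both `-E` and `-n` produces a usage error from `argparse` instead of a `TypeError` later in the script.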
