broadinstitute/longbow

Longbow

Longbow is a command-line tool for processing MAS-ISO-seq data. It uses a generative modelling approach to accurately annotate and segment MAS-ISO-seq's concatenated full-length transcript isoforms from single-cell or bulk long-read RNA sequencing libraries.

Documentation for all longbow commands can be found on the Longbow documentation page.

Installation

We recommend installing Longbow with pip.

pip install maslongbow

For a pre-built version that includes all dependencies, pull our Docker image.

docker pull us.gcr.io/broad-dsp-lrma/lr-longbow:0.6.9

To install from the GitHub source for development, run the following commands.

git clone https://github.com/broadinstitute/longbow.git
pip install -e longbow/
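
After either installation route, it can be useful to confirm that the `longbow` executable is on your `PATH`. The snippet below is a minimal sketch: the helper name `check_tool` is hypothetical and not part of Longbow, and it only tests that a command can be found, not that it runs correctly.

```shell
# Hypothetical helper (not part of Longbow): report whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found" >&2
    return 1
  fi
}

# Report the result without aborting the shell session if longbow is missing.
check_tool longbow || true
```

If the executable is found, `longbow --help` should list the available commands.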

Getting Started

The commands below illustrate the Longbow workflow on a small library of SIRVs (Spike-in RNA Variant Control Mixes). MAS-ISO-seq concatenated transcripts are annotated, filtered, and segmented using the mas_15+sc_10x5p model, and the adapter-free cDNA sequences are then extracted. The final filtered transcripts can then be aligned using standard splice-aware long-read mappers (e.g. minimap2). More detail for each command can be found in the full documentation.

# Download a tiny test dataset (less than 300 KB)
wget https://github.com/broadinstitute/longbow/raw/main/tests/test_data/mas15_test_input.bam
wget https://github.com/broadinstitute/longbow/raw/main/tests/test_data/mas15_test_input.bam.pbi
wget https://github.com/broadinstitute/longbow/raw/main/tests/test_data/resources/SIRV_Library.fasta

# Basic processing workflow
longbow annotate -m mas_15+sc_10x5p mas15_test_input.bam |  # Annotate reads according to the mas_15+sc_10x5p model
  tee ann.bam |                                             # Save annotated BAM for later
  longbow filter |                                          # Filter out improperly-constructed arrays
  longbow segment |                                         # Segment reads according to the model
  longbow extract -o filter_passed.bam                      # Extract adapter-free cDNA sequences

# Align reads with a long-read aligner (e.g. minimap2, pbmm2)
samtools fastq filter_passed.bam | \
  minimap2 -ayYL --MD -x splice:hq SIRV_Library.fasta - | \
  samtools sort > align.bam &&
  samtools index align.bam
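
As a quick sanity check on the workflow above, the record counts of the intermediate and final BAM files can be compared. This is a sketch under stated assumptions: the helper name `count_records` is hypothetical, the file names are the ones used in the commands above, and `samtools view -c` is used simply to count records.

```shell
# Hypothetical helper (not part of Longbow): print the number of records in a BAM file.
count_records() {
  if [ ! -f "$1" ]; then
    echo "$1: file not found" >&2
    return 1
  fi
  samtools view -c "$1"
}

# Report counts for each workflow output; files that do not exist yet are skipped.
for bam in ann.bam filter_passed.bam align.bam; do
  if [ -f "$bam" ]; then
    echo "$bam: $(count_records "$bam") records"
  fi
done
```

Because each array is segmented into several transcripts, filter_passed.bam would typically hold more records than ann.bam. The count for align.bam can differ again, since an aligner may emit secondary or supplementary alignments.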

Getting help

The Longbow documentation page describes the command-line options and the underlying algorithms in detail. If you encounter bugs or have questions, comments, or concerns, please file an issue on our GitHub page.

Developers' guide

For information on contributing to Longbow development, visit our developer documentation.