ArrowSAM

ArrowSAM is an in-memory Sequence Alignment/Map (SAM) representation. It uses the Apache Arrow framework (a cross-language development platform for in-memory data) and the Plasma shared-memory object store to store and process SAM data in columnar form in memory.
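To illustrate what "columnar" means for SAM data, here is a minimal sketch that transposes SAM text records into one list per mandatory field. It uses only the Python standard library and does not call Apache Arrow or Plasma. The function name `sam_to_columns`, the sample records, and the choice to drop optional tag fields are assumptions made for illustration, not ArrowSAM's actual schema or code.

```python
# The 11 mandatory SAM fields, in file order.
SAM_FIELDS = (
    "QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
    "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL",
)
# Mandatory fields that hold integers.
INT_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}


def sam_to_columns(lines):
    """Transpose row-oriented SAM records into one list per field.

    Hypothetical helper: header lines (starting with '@') are skipped and
    optional tag fields after the 11th column are ignored.
    """
    columns = {name: [] for name in SAM_FIELDS}
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        values = line.rstrip("\n").split("\t")[:len(SAM_FIELDS)]
        if len(values) < len(SAM_FIELDS):
            raise ValueError("SAM record has fewer than 11 fields")
        for name, value in zip(SAM_FIELDS, values):
            columns[name].append(int(value) if name in INT_FIELDS else value)
    return columns


# Made-up sample records, for illustration only.
records = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t7\t30\t8M\t=\t37\t39\tTTAGATAA\t*",
    "r2\t0\tchr1\t9\t30\t6M\t*\t0\t0\tAGATAA\t*",
]
columns = sam_to_columns(records)
```

With this layout, a tool that only needs alignment positions can scan `columns["POS"]` without touching the sequence or quality strings, which is the access pattern a columnar in-memory format is meant to serve.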

Citing ArrowSAM

The following papers describe the ArrowSAM format and its use to speed up genomics pipelines. If you use ArrowSAM in your work, please cite them.

Ahmad et al., (2020). "ArrowSAM: In-Memory Genomics Data Processing Using Apache Arrow", ICCAIS. https://doi.org/10.1109/ICCAIS48893.2020.9096725

Ahmad et al., "Optimizing performance of GATK workflows using Apache Arrow In-Memory data framework", BMC Genomics, presented at APBC2020. https://doi.org/10.1186/s12864-020-07013-y

This repo contains the following three components:

BWA-MEM, Picard and GATK tools integrated with ArrowSAM (the in-memory SAM data representation).
A Singularity container definition file, which creates an environment with all the Apache Arrow tools and libraries that ArrowSAM needs.
Scripts that run the workflows recommended by the GATK best practices, using different in-memory data placement techniques (ArrowSAM, ramDisk and pipes) to run a complete DNA analysis pipeline efficiently.
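
Of the data placement techniques named above, pipes are the simplest to show in isolation. The sketch below connects two child processes with an OS pipe so that no intermediate file is written between stages. It is not taken from the repository's scripts: the two one-line Python programs are hypothetical stand-ins for real pipeline stages (for example an aligner followed by a sorter), and `run_piped` is an illustrative name.

```python
import subprocess
import sys

# Hypothetical stand-ins for two pipeline stages. A real workflow would run
# the actual tools here instead of these one-line Python programs.
PRODUCER = "for name in ('read3', 'read1', 'read2'): print(name)"
CONSUMER = "import sys; sys.stdout.writelines(sorted(sys.stdin))"


def run_piped(producer_code, consumer_code):
    """Stream the producer's output into the consumer through an OS pipe."""
    producer = subprocess.Popen(
        [sys.executable, "-c", producer_code],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c", consumer_code],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Close the parent's copy of the pipe so that the consumer is the only
    # reader and the producer is notified if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.splitlines()


sorted_names = run_piped(PRODUCER, CONSUMER)
```

Data flows from one stage to the next through kernel buffers, so the stages run concurrently and nothing is written to disk in between.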

Note: ArrowSAM and all other workflows are designed for single-node, multi-core machines.

How to run

Install Singularity container
Download our Singularity script and generate a Singularity image (this image contains all the Arrow-related packages needed to build BWA-MEM, Picard and GATK).
Now enter the generated image with the command:
```
 sudo singularity shell <image_name>.simg
```

Download BWA-MEM inside the image:

 git clone https://github.com/tahashmi/bwa.git

Change into the bwa directory and compile BWA-MEM:
```
 cd bwa
 make
```
Now you can run BWA-MEM.
