Introduction | Getting Started | Additional Info | Versioning | Authors | Acknowledgements
This repo contains the code used to run the analyses and generate the figures for Roth, Muench et al. The paper describes a new resource of patient-derived iPSCs bearing a 16p11.2 copy number variant, explores the potential utility of these clones, and describes the possible impact of clonal integration on iPSC-derived tissue models. I wrote this README for other biologists who might want to follow up on our analyses or investigate their own integration effects.
The repo is divided into two sections. Their names are something of a misnomer, left over from an earlier revision:
- "figure5": contains a differential expression analysis of the integration-negative clones, aligned with STAR and counted with htseq-count.
- "figure6": contains an independent bioinformatic comparison of integration-negative and integration-positive clones, aligned using kallisto.
The data will be made available on GEO (under embargo during revisions as of October 12, 2020).
setup.Rmd
-
Install the following
DESeq.Rmd
-
In addition to the packages required for Setup, install the following
tximport_Setup.Rmd
-
Install the following
deseq.Rmd
-
Install the following
heatmaps.Rmd
-
Install the following
barPlots.Rmd
-
In addition to the packages required for Setup, install the following
GSEA.Rmd
- GSEA .jar file
I thought it might be easier to import and document variables using a spreadsheet (userVars.csv) rather than a .bashrc file.
Within each figure directory, the code has been broken up into several parts. You should run the code in this order:
Figure 5
- setup.Rmd
- deseq.Rmd
Figure 6
- tximport_setup.Rmd
- deseq.Rmd
- barPlots.Rmd OR heatmaps.Rmd OR GSEA.Rmd
The code writes a separate output file for each distinct run date, where the run date is defined in the userVars.csv file. This way, you can keep copies of all output as you make small tweaks to the code.
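The idea behind date-stamped output can be sketched as follows. This is only an illustration in Python, not the repository's R code: the two-column layout (`variable,value`), the `run_date` key, and the helper names `read_user_vars` and `dated_output_dir` are all assumptions made for the example, and the real userVars.csv may be organized differently.

```python
import csv
import io
import tempfile
from pathlib import Path


def read_user_vars(handle):
    """Read a two-column CSV (variable,value) into a dict.

    The column names are an assumed layout, not the repo's actual one.
    """
    reader = csv.DictReader(handle)
    return {row["variable"]: row["value"] for row in reader}


def dated_output_dir(base_dir, user_vars):
    """Return base_dir/<run_date>, creating the directory if needed."""
    out_dir = Path(base_dir) / user_vars["run_date"]
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


# Stand-in for the contents of userVars.csv (hypothetical layout and value).
example_csv = io.StringIO("variable,value\nrun_date,20201012\n")
user_vars = read_user_vars(example_csv)

# A temporary directory keeps the example from writing into the repo.
base_dir = tempfile.mkdtemp()
out_dir = dated_output_dir(base_dir, user_vars)
print(out_dir)
```

Changing `run_date` in the CSV and rerunning yields a new directory next to the old one, so earlier results are never overwritten.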
For the alignment and counting steps, I used one of two different aligners:
- STAR (Figure 5) with htseq-count
- kallisto (Figure 6)
I performed both of these on the Stanford Center for Personalized Medicine Cluster. I recommend running STAR on a cluster. In theory, you should be able to run kallisto on a laptop.
I performed subsequent analyses using R and RStudio.
For the versions available, see the tags on this repository.
- Kristin Muench - GitHub: kmuench
- Thank you to PurpleBooth for the README template
- Thank you to the Bader Lab for their GSEA tutorial.
- Thank you to John Hanks at the SCPGM cluster and the team at the Stanford Functional Genomics Facility for their help supporting this work.