-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathproj_requirement.txt
27 lines (20 loc) · 2.38 KB
/
proj_requirement.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
UI for RNA-seq quantification*
High throughput RNA-seq data contain millions of reads which are aligned/mapped to the reference transcriptome with a mapper. Subsequently, a tool is used to report the expression level of different transcripts in the experimental sample corresponding to the reads. The output from the quantifier (such as RSEM) typically consists of many numerical results, and little visual aid to help interpret these results. In this project we aim to make a rich, interactive, web interface, which serves a data-exploration tool for the end-user (usually a Biologist or Bioinformatician). The visualizations here should be interactive, implemented, most likely, using a javascript framework.
Given the appropriate input (a set of abundance estimates — one of which may be the “ground-truth”) the framework should:
1. Generate plots and tables to explore different quality metrics
e.g.
- number of reads mapped in the experiment
- transcript GC content versus abundance estimates
- transcript length versus abundance estimates
- and other possible confounding factors versus abundance estimates
2. Create interactive plots, in the sense that a user should be able to access a data point (which is a transcript here) and by clicking on it, obtain abundance measure across all the methods / experiments etc. If the provided quantification estimates have samples or variances associated with the abundance, the user should be able to plot and explore these interactively as well by selecting a transcript or set of transcripts.
3. If the user specifies that one of the provided abundance results is the “truth” (e.g. in a simulated dataset), the framework should compute a collection of different accuracy metrics and compare all other estimates according to these. Some such metrics are
pearson correlation,
spearman correlation,
proportionality correlation,
mean absolute error,
true positive fraction and
false positive rate.
These should be provided in the form of a nicely formatted table where appropriate, and if a specific result is chosen, plots (e.g. scatter plots) should be generated on demand.
4. Remaining credit will be granted for useful and interesting visualizations and metrics that you come up with.
The goal here is really to provide an interactive, aesthetically pleasing, and useful tool for the interactive exploration of RNA-seq abundance estimation results.