Adds updated exclusion criterion for transcripts #3

Merged: 63 commits into aws on May 8, 2020

@cgpu cgpu commented Apr 29, 2020

This PR updates the first filtering criterion for keeping or discarding transcripts based on zero values in RSEM abundance estimates, as discussed with @karleg and @pnrobinson.

The suggestion was to take N, the number of individuals in the minority class (control or case), and use a percentage of it (e.g. 10%) as the tolerated fraction of zeros observed across that class's measured subjects.

The code below describes how the exclusion criterion should be implemented and applied:

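# Group sizes; the minority class sets the threshold
num.case <- length(case)
num.control <- length(control)
N_sample <- min(num.case, num.control)
# Columns 3..(2 + num.control) are the controls; the remaining columns are the cases
# Keep transcripts that are non-zero in at least 90% of N_sample among the controls
countsData <- countsData[rowSums(countsData[, 3:(2 + num.control)] > 0) >= N_sample * 0.9, ]
# Then apply the same threshold to the case columns
countsData <- countsData[rowSums(countsData[, (2 + num.control + 1):ncol(countsData)] > 0) >= N_sample * 0.9, ]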
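
To make the criterion easier to check by hand, here is a minimal sketch of the same filter on toy data. The function name `filter_transcripts`, the `min.frac` parameter and the toy table are hypothetical and only for illustration. The column layout (two identifier columns, then controls, then cases) is an assumption read off the indices in the snippet above.

```r
# Hypothetical helper wrapping the filter described in this PR.
# Assumes columns 1-2 are identifiers, followed by control then case columns.
filter_transcripts <- function(counts, num.control, num.case, min.frac = 0.9) {
  n.sample <- min(num.case, num.control)   # size of the minority class
  threshold <- n.sample * min.frac         # required number of non-zero samples
  control.cols <- 3:(2 + num.control)
  case.cols <- (2 + num.control + 1):ncol(counts)
  keep.control <- rowSums(counts[, control.cols, drop = FALSE] > 0) >= threshold
  keep.case <- rowSums(counts[, case.cols, drop = FALSE] > 0) >= threshold
  counts[keep.control & keep.case, , drop = FALSE]
}

# Toy data: 3 controls (c1-c3) and 4 cases (k1-k4), so the minority class has N = 3
toy <- data.frame(
  transcript = c("t1", "t2", "t3"),
  gene       = c("g1", "g1", "g2"),
  c1 = c(5, 0, 2), c2 = c(3, 0, 4), c3 = c(1, 7, 6),
  k1 = c(2, 1, 0), k2 = c(4, 3, 0), k3 = c(6, 5, 1), k4 = c(8, 2, 0)
)

filtered <- filter_transcripts(toy, num.control = 3, num.case = 4)
print(as.character(filtered$transcript))
```

With these toy values the threshold is 3 * 0.9 = 2.7, so only t1 is kept: t2 is non-zero in just one of the three controls, and t3 is non-zero in just one of the four cases. Combining both masks in one step gives the same rows as the two sequential subsets, since each row is tested independently.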

cgpu commented Apr 29, 2020

Also, I would change the condition to strictly greater (> instead of >=), because we want it to take effect only if we have more than 10 samples in each condition.

👍 implemented in 69fb678.
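
The diff for that commit is not shown in this conversation, so the following is only one possible reading of the review comment: a guard that enables the zero filter only when both conditions have strictly more than 10 samples. The function name `should_filter` and the `min.samples` parameter are hypothetical.

```r
# Hypothetical guard: apply the zero-value filter only when the smaller
# group has strictly more than min.samples subjects.
should_filter <- function(num.case, num.control, min.samples = 10) {
  min(num.case, num.control) > min.samples
}

at.limit <- should_filter(num.case = 10, num.control = 25)    # exactly 10 in one group
above.limit <- should_filter(num.case = 11, num.control = 25) # more than 10 in both
print(c(at.limit, above.limit))
```

Under this reading, a group of exactly 10 samples does not trigger the filter, which is the difference between `>` and `>=` that the comment asks for.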

@cgpu cgpu requested a review from karleg April 29, 2020 15:06
cgpu added 6 commits April 29, 2020 16:28
- Updated docker image sha256:ff3d87fbeb1d5dbabae1d3adcd81cfa284ba291a61ea993491fad6862ec9cb47
- Addresses incidents of SRR samples failing, e.g. because they are not paired-end
- Re-ordering for metadata after filtering out SRR that failed in previous steps
cgpu and others added 24 commits May 7, 2020 02:32
* Allocates 10 of 32 cpus for hbadeals() (with 4 of 32 cpus it worked)

CloudOS link to a successful job with 4 vCPUs:
https://cloudos.lifebit.ai/public/jobs/5eb3bc3d1795470103deadba

* Triggers first tests for fork; Adds https:// (secure)
* Applies linting suggestions from failed linting tests

* Adds asset to remove inline <img> html

* Adds dev container (nf-core) style (to keep nf-core tests happy)

* Sets reads to default:false

* Adds sarek logic; include hbadeals in tools

* Migrates processes of label:hbadeals inside rsem if clause

* Refactoring hbadeals processes and channels positions
cgpu commented May 8, 2020

@karleg @adeslatt @pnrobinson Tests passing, going forth with merge!

@cgpu cgpu merged commit 4b7414c into aws May 8, 2020