Splitting the segmented bam file? #143
There is currently no built-in way to split the segmented BAM into one file per barcode. Since the molecules being ligated together are chosen randomly, I'm not sure why you would want to split them by the MAS-adapter sequence - it shouldn't make a difference in the read content.
@jonn-smith thanks for your comment. We are planning to use MAS Isoseq on ~100 case-control single-cell cDNA samples. However, we are thinking of mixing case and control in one concatenated product. If we can identify the MAS adapter, then we can segregate the case and control reads downstream. That's the reason why we are looking for it. Thanks
@ashokpatowary That's great! It's an interesting use case. I'll put it on the todo list. I'm not sure when I can get to it - probably not for a couple of weeks. In the meantime, everything already exists in the output from |
Thanks @jonn-smith for the suggestion. We just generated the first batch of test data. We are also not in a hurry and are still trying to understand the data that was generated. I have a general follow-up question. I tried to look at the distribution of the leading MAS adapter within the SG tag. It looks something like this
From the wet-lab experimental point of view, I don't understand why we are seeing so many reads with Another question regarding the segment step: I have modified the JSON file because the UMI length was 12 bp. What is this warning about
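A rough way to tabulate the leading entry of the SG tag, as described in this comment, is sketched below. It is only an illustration under stated assumptions: the tag value is assumed to be a comma-separated list of `name:start-end` entries, and the segment names in the example are hypothetical. Neither is confirmed by this thread, so check the tag layout in your own segmented BAM first.

```python
from collections import Counter


def leading_segment(sg_tag: str) -> str:
    """Return the name of the first segment in an SG-style tag string.

    ASSUMPTION: the tag is a comma-separated list of `name:start-end`
    entries. This layout is not confirmed by the thread.
    """
    first_entry = sg_tag.split(",", 1)[0]
    # Drop the trailing ":start-end" coordinates, if present.
    return first_entry.rsplit(":", 1)[0]


def leading_segment_counts(sg_tags):
    """Count how often each segment name appears first; skip empty tags."""
    return Counter(leading_segment(tag) for tag in sg_tags if tag)


# Hypothetical tag values, for illustration only.
example_tags = [
    "A:0-15,cDNA:16-900,B:901-916",
    "A:0-15,cDNA:16-750,B:751-766",
    "B:0-15,cDNA:16-820,C:821-836",
    "",
]
example_counts = leading_segment_counts(example_tags)
```

The tag strings themselves would have to be read from the BAM with whatever library you already use; that step is left out here.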
@ashokpatowary What we have noticed is that the first MAS adapter can be missing from the beginning of an array read. I don't know why the abundance of the For QC and metrics on your data, I suggest running |
@ashokpatowary For the UMI / Cell barcode warning - the latest version of If you can post the model json you're using I might be able to see what is missing to cause it to not find the CBC and UMI (if your model has them). |
@jonn-smith I updated longbow on August 5th, so I think I am using the latest version of the tool. I have modified one line in the built-in longbow_model_mas_15_sc_10x3p model
However, when I checked the segment output file, I can see the
@ashokpatowary Yes - those are low-confidence barcode and UMI tags. You will need to do post-processing to refine them. We have done a few things in the past that have worked well to do this refinement, but there are a lot of options out there. As far as the tags go, Further down in the model you'll see a section for |
Would it be possible to have a read tag for the MAS adapter used in the segmented BAM file, so that we can split the BAM file if needed? For example, if we are using 10 MAS adapters for ligation, can we split the segmented BAM file into 10 independent BAM files?
Also, what is the memory requirement for the "longbow correct" step? It keeps giving a memory error.
Thanks
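Until such a tag exists, the split could in principle be done outside the tool by bucketing reads on whatever adapter label can be recovered for each read. The sketch below is not a longbow feature. It assumes each read is available as a hypothetical `(read_name, sg_tag)` pair and that the adapter label is the first `name:start-end` entry of the tag, which this thread does not confirm.

```python
from collections import defaultdict


def group_reads_by_adapter(reads, unknown_label="unknown"):
    """Group read names by the first segment name in their SG-style tag.

    `reads` is an iterable of (read_name, sg_tag) pairs.
    ASSUMPTION: sg_tag is a comma-separated list of `name:start-end`
    entries; reads with an empty or missing tag go to `unknown_label`.
    """
    groups = defaultdict(list)
    for read_name, sg_tag in reads:
        if sg_tag:
            adapter = sg_tag.split(",", 1)[0].rsplit(":", 1)[0]
        else:
            adapter = unknown_label
        groups[adapter].append(read_name)
    return dict(groups)


# Hypothetical reads, for illustration only.
example_reads = [
    ("read1", "A:0-15,cDNA:16-900,B:901-916"),
    ("read2", "B:0-15,cDNA:16-820,C:821-836"),
    ("read3", "A:0-15,cDNA:16-750,B:751-766"),
    ("read4", None),
]
example_groups = group_reads_by_adapter(example_reads)
```

Each group could then be written to its own BAM file with a BAM library; holding only read names per group keeps the memory footprint small compared with holding full records.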