Bedtools intersect truncates bed12 intervals #919

Closed
ddubocan opened this issue Jun 2, 2021 · 7 comments · Fixed by #1034

ddubocan commented Jun 2, 2021

Hi,

I am working with the BED12 file format. When I run bedtools intersect with the -wa flag, my intervals get truncated in the output file. The truncation typically occurs in the 12th field, which leads me to believe it's an issue with how longer lines are handled.

Thanks in advance for the help,
Danilo

Owner

arq5x commented Jun 2, 2021

Which version are you using and can you provide an example command and dataset that produces the problem?

Author

ddubocan commented Jun 2, 2021

bedtools v2.30.0

bedtools intersect -wa -a example_bed12.bed -b region.bed

example_bed12.bed:

chr1 46626685 46645317 1527 4.01 . 46626685 46645317 128,0,128 102 1,104,89,167,139,135,140,141,107,94,153,178,132,162,163,147,131,133,187,146,201,242,98,156,171,167,163,153,143,143,156,143,155,119,142,139,157,94,201,163,163,160,157,154,129,147,163,158,151,170,133,144,146,151,168,132,319,205,149,132,161,153,146,140,166,130,153,151,100,179,158,156,189,178,141,130,136,168,146,176,157,124,131,166,175,152,201,184,111,123,212,148,169,340,166,260,141,154,156,139,159,1 0,99,223,403,601,794,938,1124,1324,1527,1633,1787,2021,2176,2347,2522,2752,2931,3079,3276,3434,3636,3879,3987,4146,4352,4550,4740,4915,5069,5235,5441,5600,5780,5919,6097,6264,6490,6599,6805,6969,7177,7384,7559,7766,7929,8077,8275,8469,8640,8859,9012,9163,9341,9544,9713,9902,10222,10447,10600,10788,10969,11149,11314,11479,11699,11882,12065,12297,12431,12625,12793,12955,13145,13344,13517,13681,13822,14018,14205,14382,14580,14733,14910,15094,15272,15462,15667,15852,16022,16173,16406,16593,16827,17168,17335,17690,17834,18082,18264,18442,18631

region.bed:

chr1 43250001 47250000

The output looks like:

chr1 46626685 46645317 1527 4.01 . 46626685 46645317 128,0,128 102 1,104,89,167,139,135,140,141,107,94,153,178,132,162,163,147,131,133,187,146,201,242,98,156,171,167,163,153,143,143,156,143,155,119,142,139,157,94,201,163,163,160,157,154,129,147,163,158,151,170,133,144,146,151,168,132,319,205,149,132,161,153,146,140,166,130,153,151,100,179,158,156,189,178,141,130,136,168,146,176,157,124,131,166,175,152,201,184,111,123,212,148,169,340,166,260,141,154,156,139,159,1 0,99,223,403,601,794,938,1124,1324,1527,1633,1787,2021,2176,2347,2522,2752,2931,3079,3276,3434,3636,3879,3987,4146,4352,4550,4740,4915,5069,5235,5441,5600,5780,5919,6097,6264,6490,6599,6805,6969,7177,7384,7559,7766,7929,8077,8275,8469,8640,8859,9012,9163,9341,9544,9713,9902,10222,10447,10600,10788,10969,11149,11314,11479,11699,11882,12065,12297,12431,12625,12793,12955,13145,13344,13517,13681,13822,14018,14205,14382,14580,14733,14910,15094,15272,15462,15667,15852,16022,16173,16406,16593,16827,17168,17335,17690,17834,18082,182

You can see that the last 3 values of the 12th field are missing or cut off.
Thanks
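
A quick way to spot records like this in a large file is to check each BED12 line for internal consistency. The sketch below is only an illustration, not part of bedtools: the function name `check_bed12_line` is hypothetical, and it assumes the usual BED12 conventions that blockCount (field 10) matches the number of entries in blockSizes and blockStarts (fields 11 and 12), and that the final blockStart plus the final blockSize equals chromEnd - chromStart.

```python
def check_bed12_line(line):
    """Return a list of problems found in one BED12 record (empty if none).

    Hypothetical helper for illustration; it is not a bedtools API.
    """
    line = line.rstrip("\n")
    # BED is tab-delimited; fall back to any whitespace for pasted examples.
    fields = line.split("\t") if "\t" in line else line.split()
    if len(fields) < 12:
        return [f"expected 12 fields, found {len(fields)}"]

    chrom_start, chrom_end = int(fields[1]), int(fields[2])
    block_count = int(fields[9])
    # The comma-separated lists may end with a trailing comma, so skip empty items.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]

    problems = []
    if len(sizes) != block_count:
        problems.append(f"blockCount={block_count} but {len(sizes)} blockSizes")
    if len(starts) != block_count:
        problems.append(f"blockCount={block_count} but {len(starts)} blockStarts")
    # Assumed convention: the last block must end exactly at chromEnd.
    if sizes and starts and starts[-1] + sizes[-1] != chrom_end - chrom_start:
        problems.append("last block does not end at chromEnd")
    return problems
```

For the truncated record above, the final check fails: the blockStarts list now ends in 182 rather than 18631, so the last block no longer reaches chromEnd. A line cut exactly at a field boundary could still slip past these checks, so this is a screening aid rather than proof that a file is intact.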

Owner

arq5x commented Jun 2, 2021

Thanks. I am unable to reproduce this on my laptop. Is it possible that your terminal is truncating the output? You could test this by writing the output to a file and checking its contents.
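
One way to carry out that check without relying on the terminal: with -wa, bedtools intersect writes the original entry from the -a file for each overlap, so every output line should match some input line exactly. The sketch below reports output lines that are only a proper prefix of an input line. The function name `find_truncated_records` is hypothetical, and the comparison assumes the output was written to a file and read back as lines.

```python
def find_truncated_records(input_lines, output_lines):
    """Return (output_line, output_length, original_length) for each output
    line that is a proper prefix of an input line instead of a full copy.

    Hypothetical helper for illustration; it is not a bedtools API.
    """
    originals = [line.rstrip("\n") for line in input_lines]
    original_set = set(originals)
    truncated = []
    for out in output_lines:
        out = out.rstrip("\n")
        # Skip blank lines and lines that were copied through intact.
        if not out or out in original_set:
            continue
        for original in originals:
            if original.startswith(out):
                truncated.append((out, len(out), len(original)))
                break
    return truncated

# Example use (the output file name is an assumption):
#   with open("example_bed12.bed") as a, open("intersect_output.bed") as o:
#       for line, got, expected in find_truncated_records(a, o):
#           print(f"kept {got} of {expected} characters")
```

The reported lengths are worth noting in a bug report, since a consistent cutoff length across records would point to a fixed-size buffer rather than a display problem.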

Author

ddubocan commented Jun 2, 2021

That interval is pulled directly from a file. I've been redirecting stdout to a file.

It may well be the system I am using. I will try the same command on a different system and see whether that fixes the issue.

Thanks again for your prompt help!

arq5x closed this as completed Jun 3, 2021

@vinzenzmay

Hi, I have the exact same problem on two systems: Ubuntu 20.04 (bash 5.1.16(1), bedtools v2.30.0) and Rocky Linux 8.5 (bash 4.4.20(1), bedtools v2.30.0).
@ddubocan have you found a solution?

cwhelan commented Jan 29, 2023

I've also experienced similar behavior with bedtools v2.30.0 on Ubuntu 22.04/bash 5.1.16(1). In my case, I had encoded a long string in the fourth (name) column of the BED interval, and that field was truncated after 1004 characters. If there's additional information I can provide that would be helpful, let me know.
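
A small synthetic input can help narrow down the length at which a field starts to be cut. The sketch below builds a BED4 record with a name field of a chosen length and measures the name field of a line read back from the output. Both function names are hypothetical, and the default coordinates are simply chosen to overlap the region.bed interval posted earlier in this thread.

```python
def make_long_name_record(name_length, chrom="chr1", start=46626685, end=46645317):
    """Build one tab-delimited BED4 line whose name field has name_length characters.

    Hypothetical helper for illustration; the default coordinates are an
    assumption chosen to overlap chr1:43250001-47250000.
    """
    name = ("ACGT" * (name_length // 4 + 1))[:name_length]
    return f"{chrom}\t{start}\t{end}\t{name}\n"


def name_field_length(bed_line):
    """Return the length of the fourth (name) field of a tab-delimited BED line."""
    return len(bed_line.rstrip("\n").split("\t")[3])
```

Writing records of several lengths (for example 500, 1000, 1500 and 2000 characters) to a file, running the same intersect command on it, and comparing `name_field_length` before and after would show whether the cutoff is fixed and where it falls.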

Owner

arq5x commented Feb 2, 2023

@brentp could you test this with the version in main? I am fairly certain this is fixed, which makes it more urgent for me to make a new formal release.

brentp added a commit to brentp/bedtools2 that referenced this issue Feb 2, 2023
arq5x added a commit that referenced this issue Feb 2, 2023