readdlm and fixed column data #5391
You need to specify a delimiter as the second argument in
seems to work fine.
If you specify a delimiter, the padding spaces inside the (fixed-width) fields are interpreted as separate columns, so the rows end up with different numbers of columns. The original file is:
I get a BoundsError:
page_blocks = readdlm("page-blocks.data", ' ')
ERROR: BoundsError()
I see. Seems There is also
which gave
Maybe someone else (cc: @johnmyleswhite) has an idea?
We don't support fixed-width fields yet. It's not that hard: I might even be able to finish a demo on the way to work today. But fixed-width files have almost nothing in common with delimited files, so our existing infrastructure is only slightly usable.
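To illustrate why fixed-width parsing differs from delimited parsing, here is a minimal sketch that slices each line by character position instead of searching for a delimiter. The column widths, the sample line, and the helper names `column_ranges` and `parse_fixed` are assumptions made up for this example; they are not taken from page-blocks.data or from any existing Julia API. It also assumes ASCII input, so character positions equal byte indices.

```julia
# Hypothetical layout: these widths are illustrative only.
widths = [4, 4, 6]

# Turn a list of field widths into index ranges, e.g. [4, 4, 6] -> 1:4, 5:8, 9:14.
function column_ranges(widths)
    stops = cumsum(widths)
    starts = stops .- widths .+ 1
    return [a:b for (a, b) in zip(starts, stops)]
end

# Slice one line by position, strip the padding, and parse each field as Float64.
# Assumes the line is ASCII and at least sum(widths) characters long.
parse_fixed(line, ranges) = [parse(Float64, strip(line[r])) for r in ranges]

line = "   5   7    35"
fields = parse_fixed(line, column_ranges(widths))
```

Because the fields are located by position, padding spaces never turn into extra columns, which is the problem a single-character delimiter runs into.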
Maybe we could add an option to readdlm to skip empty columns; that might handle cases like this.
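As a rough sketch of what skipping empty columns amounts to, the rows can be split on runs of whitespace so that padding never produces empty fields. This is not how readdlm is implemented; it only assumes that every field is non-empty and numeric, and the padding in the sample rows below is invented for illustration. `parse_row` is a hypothetical helper name.

```julia
# Sample rows from the issue; the exact padding here is an assumption.
rows = [
    "   5   7    35  1.400  .400  .657  2.33    14    23     6 1",
    "   6   7    42  1.167  .429  .881  3.60    18    37     5 1",
    "   6  18   108  3.000  .287  .741  4.43    31    80     7 1",
]

# split(line) with no delimiter splits on whitespace and drops empty fields,
# so consecutive padding spaces collapse into a single separator.
parse_row(line) = [parse(Float64, tok) for tok in split(line)]

# Stack the parsed rows (each turned into a 1×n row matrix) into one matrix.
data = reduce(vcat, (permutedims(parse_row(r)) for r in rows))
```

This only works when no field is blank; a truly fixed-width file with empty fields would still need position-based slicing.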
Well, maybe readdlm is not appropriate for reading fixed-width data, but I just want to read my data. For this file I converted it into a comma-separated file and could read it. It would be nice to be able to read it without the conversion.
I've started the work of doing this in JuliaData/DataFrames.jl#475. I started with binary files, which were more useful to me. I'll get to text files soon.
PR #5400 addresses the issue of default delimiters not being applied and the BoundsError when there are empty columns. With this patch
address readdlm default delimiter and boundserror. ref: #5391
updated tests and docs. fixes JuliaLang#5391
…ingle delimiter. fixed bug in handling empty columns. updated tests and docs. fixes JuliaLang#5391
I have a file with fixed column data. The first few lines are:
5 7 35 1.400 .400 .657 2.33 14 23 6 1
6 7 42 1.167 .429 .881 3.60 18 37 5 1
6 18 108 3.000 .287 .741 4.43 31 80 7 1
If I use
page_blocks = readdlm("page-blocks.data")
it reads every row as a single string.
If I use
page_blocks = readdlm("page-blocks.data", Float64)
it reads the data as a long array of NaNs.
If I use
page_blocks = readdlm("page-blocks.data", (Int64, Int64, Int64, Float64, Float64, Float64, Float64, Int64, Int64, Int64, Int64))
I get the error:
ERROR: file entry " 5 7 35 1.400 .400 .657 2.33 14 23 6 1" cannot be converted to (Int64,Int64,Int64,Float64,Float64,Float64,Float64,Int64,Int64,Int64,Int64)
in error at error.jl:21
in dlm_fill at datafmt.jl:135
in readdlm_string at datafmt.jl:82
in readdlm_auto at datafmt.jl:50
in readdlm at datafmt.jl:42
in readdlm at datafmt.jl:35
How can I read fixed column data with readdlm?