feature request: allow NaN handling to be specified per column in read_csv() #20877

Closed
jowagner opened this issue Apr 30, 2018 · 2 comments

@jowagner
Contributor

Setting na_values=[] and keep_default_na=False seems to be the way to go when reading data with string columns. (The default behaviour is here to stay, according to comments in issue #15669.) However, if the data also contains numeric columns, the user may want NaNs to be processed in those columns. For example, with these settings "NA" is kept as a literal string in both columns:

import pandas
import io
pandas.read_csv(io.StringIO("""col1,col2
1.23,NA
NA,NB
"""), dtype=str, na_values=[], keep_default_na=False)
   | col1 | col2
-- | ---- | ----
 0 | 1.23 | NA
 1 | NA   | NB

The parameters should be extended so that one can specify the NaN treatment for each column or, better, for subsets of columns. I see that @HHest also made this suggestion in a comment in issue #15669.

@WillAyd
Member

WillAyd commented Apr 30, 2018

Unless I'm missing something in your request, this is already supported and mentioned in the read_csv documentation: na_values accepts a dict that maps column names to the strings to treat as NaN in that column.

In [3]: import pandas
   ...: import io
   ...: pandas.read_csv(io.StringIO("""col1,col2
   ...: 1.23,NA
   ...: NA,NB
   ...: """), dtype=str, na_values={'col1': ['NA'], 'col2': []},
   ...: keep_default_na=False)

Out[3]: 
   col1 col2
0  1.23   NA
1   NaN   NB
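
The example above forces dtype=str, so col1 is still a string column. As a minimal sketch of the numeric case from the original request, the same dict can be passed without dtype=str, in which case col1 should be inferred as a float column while col2 keeps its literal "NA". The variable names (csv_text, df) are only illustrative.

```python
import io

import pandas as pd

# Same data as above: col1 is numeric with a missing value, col2 is text
# in which "NA" is a legitimate string.
csv_text = "col1,col2\n1.23,NA\nNA,NB\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    # Per-column NaN markers: "NA" means missing in col1 only.
    na_values={"col1": ["NA"], "col2": []},
    # Switch off the built-in marker list so col2 is left untouched.
    keep_default_na=False,
)

print(df)
print(df.dtypes)
```

Leaving dtype unset lets pandas infer a floating-point dtype for col1, so the missing entry becomes a real NaN that numeric operations can skip.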

@jowagner
Contributor Author

jowagner commented May 1, 2018

You are right. Thanks.

At least now we have your code example here on a page with keywords that people are likely to search when they have the same problem.
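
For the "subsets of columns" part of the request, one possible sketch is to build the na_values dict from two column lists. The helper name build_na_values and the column names a, b and label are hypothetical and not part of pandas; only read_csv, na_values and keep_default_na come from the library.

```python
import io

import pandas as pd


def build_na_values(na_columns, literal_columns, markers=("NA",)):
    """Hypothetical helper: map each column to its list of NaN markers.

    Columns in na_columns treat every string in markers as missing;
    columns in literal_columns get an empty list and keep all strings.
    """
    spec = {col: list(markers) for col in na_columns}
    spec.update({col: [] for col in literal_columns})
    return spec


# Illustrative data: two numeric columns and one text column.
csv_text = "a,b,label\n1.5,NA,NA\nNA,2,NB\n"

na_spec = build_na_values(na_columns=["a", "b"], literal_columns=["label"])

df = pd.read_csv(
    io.StringIO(csv_text),
    na_values=na_spec,
    keep_default_na=False,  # only the markers listed in na_spec apply
)

print(df)
```

Every column is listed explicitly here, either with markers or with an empty list, which mirrors the dict used in the example earlier in this thread.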

I'll add a clarification to my documentation issue #20875.
