-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UTF-8 encoding of csv #17
Comments
Thanks for reporting the issue.
Would you mind sharing a CSV that fails? The best option may be to correct this behavior. |
I should have mentioned that it depends on the ordering of the columns as to whether or not it fails... If you take the I think this is because the spreadsheet program puts a UTF-8 byte order mark (BOM) at the start of the file, so D47crunch can't find the column headings it expects to see. |
In that case, the best case scenario is that another field gets a spurious BOM in its name, which is likely to cause cryptic problems down the line. If you send me an example file I can try to debug. The longer-term solution, of course, it to use a more robust csv parsing implementation. There are probably many out there. On the other hand, you could argue that producing clean csv files is Excel's job, not D47crunch's. Implementing corrective measures in every library likely to encounter this format is expensive. Converting “quasi-CSV” to ”bare-CSV” between Excel and these libraries would probably be less expensive. YMMV. |
I agree 100% that it is an Excel problem not a D47crunch problem... I just thought it might be discouraging for a relatively new python user trying to get their D47crunch / D47calib script working. A simple error msg informing the user that they need to clean up their filthy csv file could do the job? See file attached: |
Just a suggestion regarding the documentation...
It seems that
D47data.read()
will fail if the input csv file is in UTF-8 encoding. This is not a problem with D47crunch, but may trip up some new users if they were to produce their csv file using the default csv format of certain spreadsheet programs.Would it be worth putting a warning about this in section 1.2 of the documentation just in case?
The text was updated successfully, but these errors were encountered: