Correctly handle CSV files with a single separator throughout #3186

Open: wants to merge 3 commits into master

keith-hall (Collaborator) commented Jan 24, 2025

fixes #3127 and fixes #2078

better auto-detection of CSV delimiter

  • files with a .tsv extension are automatically detected as tab-delimited
  • other files parsed as CSV go through the following steps:
    • if the first line contains at least 3 occurrences of the same separator, that separator is used as the delimiter
    • if the first line contains only one supported separator character, that separator is used as the delimiter
    • otherwise, parsing falls back to treating every supported separator as a delimiter

supported delimiters, in precedence order:

  • comma ,
  • semicolon ;
  • tab \t
  • pipe |
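
The detection steps and precedence order above can be sketched in Python as follows. This is only an illustration of the described rules, not the code in this PR: `detect_delimiters` is a hypothetical helper, the "only one supported separator character" rule is read here as "only one kind of supported separator appears in the first line", precedence is assumed to break ties when several separators occur at least 3 times, and quoted fields are ignored.

```python
from __future__ import annotations

import os

# Supported delimiters, in the precedence order listed above.
DELIMITERS = [",", ";", "\t", "|"]


def detect_delimiters(filename: str, first_line: str) -> list[str]:
    """Return the delimiter(s) to split on (hypothetical helper)."""
    # A .tsv extension short-circuits detection: always tab-delimited.
    # Matching the extension case-insensitively is an assumption.
    if os.path.splitext(filename)[1].lower() == ".tsv":
        return ["\t"]

    counts = {d: first_line.count(d) for d in DELIMITERS}

    # Step 1: a separator occurring at least 3 times in the first line wins.
    # Assumption: precedence order decides when more than one qualifies.
    for d in DELIMITERS:
        if counts[d] >= 3:
            return [d]

    # Step 2 (assumed reading): exactly one kind of supported separator
    # appears in the first line, so use it.
    present = [d for d in DELIMITERS if counts[d] > 0]
    if len(present) == 1:
        return present

    # Step 3: fall back to treating every supported separator as a delimiter.
    return list(DELIMITERS)
```

Under these assumptions, a first line such as `a;b;c;d,e` in a `.csv` file resolves to the semicolon alone, while `a,b;c` falls through to all four separators.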


keith-hall and others added 3 commits January 25, 2025 21:31
Successfully merging this pull request may close these issues.

Determine CSV delimiter based on header
TSV highlighting doesn't work