Consider changing name "valid.csv" and "invalid.csv" #83

Closed
JRMeyer opened this issue Feb 4, 2019 · 1 comment

@JRMeyer
Contributor

JRMeyer commented Feb 4, 2019

Given that "dev" and "valid" are used interchangeably in the machine learning world to refer to the held-out dataset which is used for early stopping, I think having a valid.csv as well as a dev.csv will lead to confusion.

Given that "invalid.csv" also exists, and that the filenames should make it obvious that "valid.csv" and "invalid.csv" are complementary sets, I suggest the following names:

valid.csv invalid.csv
validated.csv invalidated.csv
accurate.csv inaccurate.csv
correct.csv incorrect.csv
confirmed.csv unconfirmed.csv
verified.csv unverified.csv
validated_transcripts.csv invalidated_transcripts.csv
accurate_transcripts.csv inaccurate_transcripts.csv
correct_transcripts.csv incorrect_transcripts.csv
confirmed_transcripts.csv unconfirmed_transcripts.csv
verified_transcripts.csv unverified_transcripts.csv
@kdavis-mozilla
Contributor

What about keeping it simple with validated.csv and invalidated.csv?

@JRMeyer JRMeyer closed this as completed in d73df91 Feb 8, 2019
JRMeyer added a commit that referenced this issue Feb 8, 2019
Fixed #83 renamed output tsv's
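
For anyone holding files produced under the old names, the rename discussed in this thread could be applied locally with something like the sketch below. It is only an illustration: the `rename_outputs` helper and the `RENAMES` mapping are hypothetical and not part of this repository, and the `.csv` extension follows the names used in the discussion (the commit message mentions TSVs, so adjust the mapping to whatever your files are actually called).

```python
from pathlib import Path

# Hypothetical mapping based on the names proposed in this thread.
# Adjust the extensions if your output files are TSVs.
RENAMES = {
    "valid.csv": "validated.csv",
    "invalid.csv": "invalidated.csv",
}


def rename_outputs(directory):
    """Rename old-style output files in `directory`; return the new paths."""
    renamed = []
    for old_name, new_name in RENAMES.items():
        old_path = Path(directory) / old_name
        new_path = Path(directory) / new_name
        # Skip files that are missing, and never overwrite an existing file.
        if old_path.exists() and not new_path.exists():
            old_path.rename(new_path)
            renamed.append(new_path)
    return renamed
```

Files that are not in the mapping, such as dev.csv, are left untouched, and running the helper a second time renames nothing.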