Datalib is inferring my string as a number #95

Open
ferndot opened this issue May 24, 2018 · 3 comments

Comments

@ferndot

ferndot commented May 24, 2018

Given the following TSV file, datalib is inferring the name column to be a number.

Example TSV:

owner_slug	slug	aggregate	name	square_footage
company_demo	1bf8caed-89d0-4547-b1f9-feac7d72e91b	TRUE	Restaurant 11057	3000

Datalib call:

datalib.tsv(
  {
    url: 'example.tsv'
  },
  function (error, data) {
    if (error) {
      console.log(error)
    } else {
      console.log(data)
    }
  }
)

@jheer
Member

jheer commented May 24, 2018

Thanks for the bug report. When I attempt to reproduce, I find that the type inference methods are inferring the name column to be a date, for which a timestamp number is then produced.

Strangely enough, the built-in Date.parse method (at least in Chrome and Node.js) successfully parses the example string value to a date:

new Date(Date.parse('Restaurant 11057'))
// Thu Jan 01 11057 00:00:00 GMT-0800 (PST)

Fixing this will likely require significant changes to how Date inference is performed (as we currently leverage the results from Date.parse). In the meantime, I recommend explicitly providing the desired column types to datalib rather than relying on type inference.
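
To make the workaround concrete, here is a minimal, library-free sketch of what declaring column types up front achieves. `parseTsv` and `converters` are hypothetical names used only for illustration and are not part of datalib. The point is that a column declared as a string is never passed through date or number inference.

```javascript
// Hypothetical helper, not part of datalib: parse TSV text using an explicit
// column-to-type map, so no column is ever type-inferred.
const converters = {
  string: (s) => s,
  number: (s) => Number(s),
  boolean: (s) => s.toLowerCase() === 'true',
};

function parseTsv(text, types) {
  const [header, ...rows] = text.trim().split('\n');
  const fields = header.split('\t');
  return rows.map((row) => {
    const cells = row.split('\t');
    const record = {};
    fields.forEach((field, i) => {
      // Columns without a declared type stay as strings.
      const convert = converters[types[field]] || converters.string;
      record[field] = convert(cells[i]);
    });
    return record;
  });
}

// The example row from the report above.
const tsv = [
  'owner_slug\tslug\taggregate\tname\tsquare_footage',
  'company_demo\t1bf8caed-89d0-4547-b1f9-feac7d72e91b\tTRUE\tRestaurant 11057\t3000',
].join('\n');

const data = parseTsv(tsv, {
  aggregate: 'boolean',
  name: 'string',
  square_footage: 'number',
});
console.log(data[0].name); // "Restaurant 11057"
```

In datalib itself, the equivalent is, as far as I understand it, a `parse` object in the format options that maps field names to type names such as 'string' and 'number'. Please check the datalib documentation for the exact form before relying on that.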

@ferndot
Author

ferndot commented Sep 27, 2018

@jheer: we could easily fix this by using Moment.js. Here is a very simple example: http://jsfiddle.net/zcvxsbo2/2/. We could also see whether a smaller, more modern library such as date-fns or d3-time-format (which is already included) would work.

This would also make the date parser more robust, consistent, and able to support more formats.

I can provide a patch if you'd like 😄
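
For reference, the strict-parsing idea can also be sketched without any library. This is not the Moment.js or d3-time-format approach mentioned above, and it is not datalib's current code. It is a hypothetical `isStrictDate` check that only hands a string to Date.parse once it matches an ISO 8601-like shape, which is assumed here to be the only format worth inferring as a date.

```javascript
// Hypothetical stricter date test (an assumption, not datalib's inference):
// a value counts as a date only if it looks like YYYY-MM-DD, optionally
// followed by a time and a "Z" or "+HH:mm" offset.
const ISO_LIKE =
  /^\d{4}-\d{2}-\d{2}(?:[T ]\d{2}:\d{2}(?::\d{2}(?:\.\d{3})?)?(?:Z|[+-]\d{2}:\d{2})?)?$/;

function isStrictDate(value) {
  if (typeof value !== 'string') return false;
  const text = value.trim();
  // Reject anything that does not have the expected shape before parsing.
  if (!ISO_LIKE.test(text)) return false;
  // Normalize "YYYY-MM-DD HH:mm" to the "T" separator, then let Date.parse
  // decide whether the value is readable at all.
  return !Number.isNaN(Date.parse(text.replace(' ', 'T')));
}

console.log(isStrictDate('Restaurant 11057')); // false
console.log(isStrictDate('2018-05-24'));       // true
```

The trade-off is that formats such as "May 24, 2018" would no longer be inferred as dates. A format-aware library would let the accepted formats be listed explicitly instead of hard-coding one pattern.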

@jonathanzong
Member

Hi! I just wanted to see if there had been any changes to Date inference since this discussion. In Lyra, we've been loading datasets from vega-datasets through datalib and noticing a few incorrect type inferences involving dates. If there are currently no plans to revisit this issue, I can potentially look into it at some point, but I want to make sure I'm not duplicating the work of someone more familiar with this library first.
