-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Time grain support for unix-timestamp columns #1093
Conversation
Also worth mentioning that, the other way to work with unix timestamps in Caravel is to create a custom column with a SQL expression converting timestamp to datetime. The main drawback for that approach is performance. We usually have index created on time columns, but It's impossible for DB to use index with the SQL function invoked on the left side of comparison in where clauses. |
Patch looks good to me, i'd move the reference to the sqlite3 issue to the commit message instead of code as hopefully we can fix that in a timely way. |
Thanks, @xrmx. I changed the code to use another datetime conversion function which won't be affected by |
@@ -172,12 +172,9 @@ def get_df(self, query_obj=None): | |||
raise Exception("No data, review your incantations!") | |||
else: | |||
if 'timestamp' in df.columns: | |||
if timestamp_format == "epoch_s": | |||
if timestamp_format in ("epoch_s", "epoch_ms"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like the previous approach was more explicit and not relying on magic as much. If someone was to plot historical event in the far past or close to 1970 the magic (or pandas guessing the unit) might not hold as well.
I'd leave the file as is was if it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old way is more explicit, but won't work. That's because unix timestamp has been converted to a datetime type, so either unit='s'
or unit='ms'
will throw error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't explain very clearly in my previous comment.
Pandas doesn't need to guess whether it's in seconds or milliseconds here after my change, because the data has already be converted to datetime type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha
Apart from the inline comment I made this LGTM. [outside the scope of this PR sidenote] we'll need to bring all the logic around different db engines in one place eventually as we're starting to have a fair amount of it all over the code base... |
@mistercrunch I've replied your inline comments. There doesn't seem any further actions I need take. If my reply looks good to you, could you merge this branch? |
Since we've already converted unix epoch to datetime type, we shouldn't specify 'unit' parameter in pandas.to_datetime
Thanks for the PR! |
* Add time grain support for time columnd in unix timestamp * Fix datetime parsing for unix epoch Since we've already converted unix epoch to datetime type, we shouldn't specify 'unit' parameter in pandas.to_datetime * Fix SQLite timestamp to datetime conversion
Caravel has basic support for using unix-timestamp as time column when using 'epoch_s' or 'epoch_ms' in python date format. But those unix-timestamp columns don't work well with time grains such as hour/day/week/month, which makes them almost useless.
This PR adds time grain support for unix-timestamp columns. If a integer/real column stores unix timestamp data, you could toggle
is temporal
flag and setPython date format
asepoch_s
(orepoch_ms
if in milliseconds). And now you could use it as time column in any time series visualization, and could select any time grain.Below is a screenshot of using
![image](https://cloud.githubusercontent.com/assets/58540/18432854/0363a350-7917-11e6-9699-74509bcc25fb.png)
epoch_s
to render a calendar view with examplemulti_format_time_series
table