Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when using "TIME COLUMN" along with "X-AXIS" for same value for Clickhouse #23384

Closed
3 tasks done
NikoJ opened this issue Mar 15, 2023 · 13 comments
Closed
3 tasks done
Labels
#bug Bug report data:connect:clickhouse Related to Clickhouse

Comments

@NikoJ
Copy link

NikoJ commented Mar 15, 2023

When using Mixed Chart I encountered an Error when using TIME COLUMN along with X-AXIS for the same value when the source is Clickhouse.

The error looks like:

Error: :HTTPDriver for http://xx.xxx.x.84:8123 returned response code 500) Code: 215. DB::Exception: Column `dt` is not under aggregate function and not in GROUP BY. Have columns: ['count()','toStartOfDay(toDateTime(toStartOfDay(toDateTime(dt))))']: While processing toStartOfDay(toDateTime(dt)) AS dt, count() AS ...

I found a mistake in the query generation for Clickhouse:

SELECT toStartOfDay(toDateTime(`dt`)) AS `dt`,
   	COUNT(*) AS `count`
FROM
  (SELECT dt,
      	id,
      	message
   FROM default.test_table_v2) AS `virtual_table`
GROUP BY toStartOfDay(toDateTime(`dt`))
ORDER BY `count` DESC
LIMIT 10000;

The alias specified in the select SELECT toStartOfDay(toDateTime(dt)) AS dt is then substituted for the section with grouping GROUP BY toStartOfDay(toDateTime(toStartOfDay(toDateTime(dt)))).

How to reproduce the bug

  1. Create Database with Clickhouse Connect
  2. Create table
CREATE TABLE IF NOT EXISTS demo(
id UInt64,
message String,
dt Date
) ENGINE = MergeTree()
ORDER BY id;
  1. Insert test data
INSERT INTO demo SELECT abs(toInt8(rand(2))) as id, fuzzBits('Demo',0.01) as message, '2023-02-05' as dt FROM numbers(100);
INSERT INTO demo SELECT abs(toInt8(rand(2))) as id, fuzzBits('Demo',0.01) as message, '2023-02-12' as dt FROM numbers(100);
INSERT INTO demo SELECT abs(toInt8(rand(2))) as id, fuzzBits('Demo',0.01) as message, '2023-03-01' as dt FROM numbers(100);

  1. Create Dataset based on clickhouse data
  2. Create Chart based on Dataset from p. 5 with chart type `Mixed Chart
  • TIME COLUMN = dt
  • TIME GRAIN = Day
  • X-AXIS = dt
  • Query A METRICS = SUM(id)
  • Query B METRICS = SUM(id)

see Screenshots

Expected results

In my opinion there are two ways. Or generate a query with a different alias or substitute an alias in the grouping section.

SELECT toStartOfDay(toDateTime(`dt`)) AS `dt_`,
   	COUNT(*) AS `count`
FROM
  (SELECT dt,
      	id,
      	message
   FROM default.demo) AS `virtual_table`
GROUP BY toStartOfDay(toDateTime(`dt`))
ORDER BY `count` DESC
LIMIT 10000;
SELECT toStartOfDay(toDateTime(`dt`)) AS `dt`,
   	COUNT(*) AS `count`
FROM
  (SELECT dt,
      	id,
      	message
   FROM default.demo) AS `virtual_table`
GROUP BY `dt`
ORDER BY `count` DESC
LIMIT 10000;

Screenshots

i1

Environment

  • browser type and version: Firefox/110.0
  • superset version: 2.0.1
  • python version: 3.8.12
  • superset-helm-chart-0.8.5
  • clickhouse-connect==0.5.3

Checklist

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.
@NikoJ NikoJ added the #bug Bug report label Mar 15, 2023
@ktncktnc
Copy link

This annoying bug happens all the time. I hope it will be fixed soon.

@danielG1beat
Copy link

danielG1beat commented Jun 7, 2023

I have it too 🙁

@eriendeng
Copy link
Contributor

alias after group by should not be the same with the raw column name.

@NikoJ
Copy link
Author

NikoJ commented Aug 3, 2023

I know but such query generates Superset. I used a trick to solve this problem but it's inconvenient

@nikolastojkoski
Copy link

nikolastojkoski commented Aug 23, 2023

I have the same issue on version 2.1.0 when trying to use timestamp as dimension in Aggregate Table.

I used a trick to solve this problem but it's inconvenient

@NikoJ can you share your workaround?

@Rydberg95
Copy link

Unfortunately still here in 3.0.0.

image

@Rydberg95
Copy link

Rydberg95 commented Sep 21, 2023

A workaround I found is actually to edit the Date column that is related to the error to be defined as a "Custom SQL" function, like this:

image

Make sure that the column has the function icon after it's edited (for me i had to input a blank space and remove it before saving in above step. When it is a function, it looks like below, with the f(x) icon at the start.

image

This seems to mitigate the error, and so far is not producing any side effects noticed by me. Obviously not a solution worth a PR, but could be an interim solution.

EDIT:
This didn't work for another case, when I tried using a date column as the x-axis of a line chart. I fiddled around some more, and giving the column an alias withing Superset (see below) seems to be another way similar to above interim solution.
image

@chriscomeau79
Copy link

Still present in 3.0.1. The workaround above with setting a custom expression and alias works.
Similar situation where we're trying to use a DateTime64 column as X axis and it's grouping by day, with inconsistent use of aliases in the generated query.

@kovitals
Copy link

kovitals commented Apr 10, 2024

Can be fixed on ClickHouse side, with override default value for property prefer_column_name_to_alias

via user setting for default profile

<clickhouse>
<profiles>
<default>
<prefer_column_name_to_alias>1</prefer_column_name_to_alias>
....
</default>
...

@rusackas
Copy link
Member

Thanks @kovitals - if anyone else on this thread finds this to be a sufficient answer (or even a sufficient workaround), this would be a great documentation update on the Clickhouse docs page. A PR from anyone would be helpful on that point to close this out.

@NikoJ
Copy link
Author

NikoJ commented Apr 15, 2024

@rusackas Thanks! When I have free time, I will check it and then close this issue as solved if everything is ok.

@Rydberg95
Copy link

Can be fixed on ClickHouse side, with override default value for property prefer_column_name_to_alias

via user setting for default profile

<clickhouse>
<profiles>
<default>
<prefer_column_name_to_alias>1</prefer_column_name_to_alias>
....
</default>
...

Thanks, seems to work great!

As an alternative, if someone else wants to add the setting via SQL-driven workflow, the below command sets up a new settings profile and assigns it to a user:

CREATE SETTINGS PROFILE superset SETTINGS prefer_column_name_to_alias = 1 TO *your_clickhouse_username*

@rusackas
Copy link
Member

rusackas commented Jul 9, 2024

It seems this has been fixed on the CH side and this is no longer an issue. Closing, but happy to reopen if it's still an issue.

@rusackas rusackas closed this as completed Jul 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
#bug Bug report data:connect:clickhouse Related to Clickhouse
Projects
None yet
Development

No branches or pull requests

9 participants