Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(chart): Supporting custom SQL as temporal x-axis column with filter #25126

Merged
merged 14 commits into from
Sep 18, 2023
13 changes: 6 additions & 7 deletions superset-frontend/src/explore/actions/exploreActions.test.js
Original file line number Diff line number Diff line change
Expand Up @@ -22,20 +22,19 @@ import exploreReducer from 'src/explore/reducers/exploreReducer';
import * as actions from 'src/explore/actions/exploreActions';

describe('reducers', () => {
it('sets correct control value given an arbitrary key and value', () => {
it('Does not set a control value if control does not exist', () => {
const newState = exploreReducer(
defaultState,
actions.setControlValue('NEW_FIELD', 'x', []),
);
expect(newState.controls.NEW_FIELD.value).toBe('x');
expect(newState.form_data.NEW_FIELD).toBe('x');
expect(newState.controls.NEW_FIELD).toBeUndefined();
});
it('setControlValue works as expected with a checkbox', () => {
it('setControlValue works as expected with a Select control', () => {
const newState = exploreReducer(
defaultState,
actions.setControlValue('show_legend', true, []),
actions.setControlValue('y_axis_format', '$,.2f', []),
);
expect(newState.controls.show_legend.value).toBe(true);
expect(newState.form_data.show_legend).toBe(true);
expect(newState.controls.y_axis_format.value).toBe('$,.2f');
expect(newState.form_data.y_axis_format).toBe('$,.2f');
});
});
18 changes: 10 additions & 8 deletions superset-frontend/src/explore/reducers/exploreReducer.js
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,7 @@ export default function exploreReducer(state = {}, action) {
const vizType = new_form_data.viz_type;

// if the controlName is metrics, and the metric column name is updated,
// need to update column config as well to keep the previou config.
// need to update column config as well to keep the previous config.
if (controlName === 'metrics' && old_metrics_data && new_column_config) {
value.forEach((item, index) => {
if (
Expand All @@ -129,11 +129,11 @@ export default function exploreReducer(state = {}, action) {
}

// Use the processed control config (with overrides and everything)
// if `controlName` does not existing in current controls,
// if `controlName` does not exist in current controls,
const controlConfig =
state.controls[action.controlName] ||
getControlConfig(action.controlName, vizType) ||
{};
null;

// will call validators again
const control = {
Expand All @@ -149,7 +149,7 @@ export default function exploreReducer(state = {}, action) {
...state,
controls: {
...state.controls,
[controlName]: control,
...(controlConfig && { [controlName]: control }),
...(controlName === 'metrics' && { column_config }),
},
};
Expand Down Expand Up @@ -196,10 +196,12 @@ export default function exploreReducer(state = {}, action) {
triggerRender: control.renderTrigger && !hasErrors,
controls: {
...currentControlsState,
[action.controlName]: {
...control,
validationErrors: errors,
},
...(controlConfig && {
[action.controlName]: {
...control,
validationErrors: errors,
},
}),
...rerenderedControls,
},
};
Expand Down
7 changes: 6 additions & 1 deletion superset/common/query_context_factory.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@
from superset.daos.chart import ChartDAO
from superset.daos.datasource import DatasourceDAO
from superset.models.slice import Slice
from superset.utils.core import DatasourceDict, DatasourceType
from superset.utils.core import DatasourceDict, DatasourceType, is_adhoc_column

if TYPE_CHECKING:
from superset.connectors.base.models import BaseDatasource
Expand Down Expand Up @@ -128,6 +128,8 @@ def _apply_granularity(

if granularity := query_object.granularity:
filter_to_remove = None
if is_adhoc_column(x_axis): # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why we need # type: ignore here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the x_axis is from form_data and it's ANY

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be a good idea to broaden the scope of the type guard to Any so it can handle arbitrary types.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the type guard is better to scope down rather than scope up right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zephyring @villebro I think this PR introduced another potential bug in Adhoc column as X-Axis, you should sent a probe query to ensure x-axis is a temporal column.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @zhaoyongjie can you share more on what potential bug this may've introduced?

x_axis = x_axis.get("sqlExpression")
if x_axis and x_axis in temporal_columns:
filter_to_remove = x_axis
x_axis_column = next(
Expand Down Expand Up @@ -175,6 +177,9 @@ def _apply_granularity(
# another temporal filter. A new filter based on the value of
# the granularity will be added later in the code.
# In practice, this is replacing the previous default temporal filter.
if is_adhoc_column(filter_to_remove): # type: ignore
filter_to_remove = filter_to_remove.get("sqlExpression")

if filter_to_remove:
query_object.filter = [
filter
Expand Down
2 changes: 2 additions & 0 deletions superset/connectors/sqla/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -1007,6 +1007,8 @@ def adhoc_column_to_sqla( # pylint: disable=too-many-locals
qry = sa.select([sqla_column]).limit(1).select_from(tbl)
sql = self.database.compile_sqla_query(qry)
col_desc = get_columns_description(self.database, self.schema, sql)
if not col_desc:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dealing with case where custom_sql can be empty string and col_desc is empty

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should handle this on frontend and not let users save a custom sql that's an empty string

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kgabryje I agree - however, as it will be lots of work to clean up old metadata, I think we can maybe have both?

Copy link
Author

@zephyring zephyring Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should handle this edge case in the backend anyways. I will be nice to prevent user from saving empty custom sql in the frontend and show some friendly message. But for now if the custom_sql is any arbitrary string the API responds with 500 with error message directly from DB engine, which is "fine" and readable for users but just this empty string it throws index out of bound error to user which is not helpful.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can create a separate PR for UI messaging and prevention though

raise SupersetGenericDBErrorException("Column not found")
is_dttm = col_desc[0]["is_dttm"] # type: ignore
except SupersetGenericDBErrorException as ex:
raise ColumnNotFoundException(message=str(ex)) from ex
Expand Down
66 changes: 66 additions & 0 deletions tests/integration_tests/charts/data/api_tests.py
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,7 @@
from superset.superset_typing import AdhocColumn
from superset.utils.core import (
AnnotationType,
backend,
get_example_default_schema,
AdhocMetricExpressionType,
ExtraFiltersReasonType,
Expand Down Expand Up @@ -943,6 +944,71 @@ def test_chart_data_get(self):
assert data["result"][0]["status"] == "success"
assert data["result"][0]["rowcount"] == 2

@pytest.mark.usefixtures("load_birth_names_dashboard_with_slices")
def test_chart_data_get_with_x_axis_using_custom_sql(self):
"""
Chart data API: Test GET endpoint
"""
chart = db.session.query(Slice).filter_by(slice_name="Genders").one()
chart.query_context = json.dumps(
{
"datasource": {"id": chart.table.id, "type": "table"},
"force": False,
"queries": [
{
"time_range": "1900-01-01T00:00:00 : 2000-01-01T00:00:00",
"granularity": "ds",
"filters": [
{"col": "ds", "op": "TEMPORAL_RANGE", "val": "No filter"}
],
"extras": {
"having": "",
"where": "",
},
"applied_time_extras": {},
"columns": [
{
"columnType": "BASE_AXIS",
"datasourceWarning": False,
"expressionType": "SQL",
"label": "My column",
"sqlExpression": "ds",
"timeGrain": "P1W",
}
],
Comment on lines +960 to +978
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, the "granularity" and "Axis-column" shouldn't be concurrent sent from frontend. the "granularity" is a legacy Druid "concept"(and this concept as be wrapped under the hood in "modern" Druid), so from my original design would like to totally remove "granularity" in Superset, the "granularity" means that a temporal column in columns(or dimensions).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the "time_range" and "filters" in query_object is same logic as before.

Copy link

@zero-element zero-element Oct 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, the "granularity" and "Axis-column" shouldn't be concurrent sent from frontend. the "granularity" is a legacy Druid "concept"(and this concept as be wrapped under the hood in "modern" Druid), so from my original design would like to totally remove "granularity" in Superset, the "granularity" means that a temporal column in columns(or dimensions).

I'm trying to fix a bug related to granularity. Basically, the temporal granularity column should be renamed as "__timestamp" like what the legacy time series line chart did before(alias after group by should not be the same with the raw column name in clickhouse) but current chart don't, so granularity is a little different from other temporal columns(at least for users). If the "granularity" should be removed, should this bug be fixed on the frontend?

it seems like the time column as x-axis has been completely unusable for a very long time. thanks to this PR made custom SQL work

"metrics": ["sum__num"],
"orderby": [["sum__num", False]],
"annotation_layers": [],
"row_limit": 50000,
"timeseries_limit": 0,
"order_desc": True,
"url_params": {},
"custom_params": {},
"custom_form_data": {},
}
],
"form_data": {
"x_axis": {
"datasourceWarning": False,
"expressionType": "SQL",
"label": "My column",
"sqlExpression": "ds",
}
},
"result_format": "json",
"result_type": "full",
}
)
rv = self.get_assert_metric(f"api/v1/chart/{chart.id}/data/", "get_data")
assert rv.mimetype == "application/json"
data = json.loads(rv.data.decode("utf-8"))
assert data["result"][0]["status"] == "success"

if backend() == "presto":
assert data["result"][0]["rowcount"] == 41
else:
assert data["result"][0]["rowcount"] == 40

@pytest.mark.usefixtures("load_birth_names_dashboard_with_slices")
def test_chart_data_get_forced(self):
"""
Expand Down