-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[migration] metadata for dashboard filters #9109
[migration] metadata for dashboard filters #9109
Conversation
Codecov Report
@@ Coverage Diff @@
## master #9109 +/- ##
=========================================
- Coverage 59.1% 59.1% -0.01%
=========================================
Files 372 372
Lines 11920 11922 +2
Branches 2917 2919 +2
=========================================
+ Hits 7045 7046 +1
- Misses 4693 4694 +1
Partials 182 182
Continue to review full report at Codecov.
|
e29866a
to
9331512
Compare
dd1e53d
to
24206ab
Compare
24206ab
to
cd093b3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly 🐍 nits. How long does this take to run?
viz_type = Column(String(250)) | ||
|
||
|
||
dashboard_slices = Table( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for consistency, can you make this the same format as Slice
and Dashboard
?
class DashboardSlices(Base):
__tablename__ = "dashboard_slices"
id = Column(Integer, primary_key = True)
...and so on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found a few places in our code base that use variable (instead of class) for relationship table.
slice_ids = [slice.id for slice in dashboard.slices] | ||
filters = ( | ||
session.query(Slice) | ||
.filter(and_(Slice.id.in_(slice_ids), Slice.viz_type == "filter_box")) | ||
.all() | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the query for filters is necessary, since the slices shouuuld have all the info loaded, so you could just do something like:
filters = [slice for slice in dashboard.slices if slice.viz_type == "filter_box"]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it works! thank you.
) | ||
|
||
# if dashboard has filter_box | ||
if len(filters): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I slightly prefer
if filters:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
|
||
dashboards = session.query(Dashboard).all() | ||
for i, dashboard in enumerate(dashboards): | ||
print("scanning dashboard ({}/{}) >>>>".format(i + 1, len(dashboards))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't have a strong preference for which, but we should probably either stick to print
or logging.info
and not do both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in my test, scanning 11000 records takes ~4 minutes.
this print
some progress in the console so that during deployment we can see how much work is done. Otherwise it will look like db upgrade is hanging there.
the logging.info is used to save a record (can't see it from console).
|
||
session.merge(dashboard) | ||
except Exception as e: | ||
logging.exception(f"dashboard {dashboard.id} has error: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you tested this, were there ever any errors? I'm curious what kind of errors might occur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i mean to catch any user data caused null pointer exception. for example, user may have added invalid json_metada
that I can't parse ? but should be very rare.
slice_params = filter_slice.params or {} | ||
configs = slice_params.get("filter_configs") or [] | ||
|
||
if slice_params.get("date_filter", False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if slice_params.get("date_filter", False): | |
if slice_params.get("date_filter"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
cd093b3
to
159164a
Compare
slice_ids = [slice.id for slice in dashboard.slices] | ||
filters = [ | ||
slice | ||
for slice in dashboard.slices | ||
if slice.id.in_(slice_ids) and slice.viz_type == "filter_box" | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need the extra slice_ids step - can just do.
filters = [slice for slice in dashboard.slices if slice.viz_type == "filter_box"]
b396284
to
2f427c6
Compare
2f427c6
to
1a048c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@graceguo-superset could you add a line item in UPDATING.md
with the warning you mentioned in the PR description.
|
||
json_metadata.pop("filter_immune_slices", None) | ||
json_metadata.pop("filter_immune_slice_fields", None) | ||
if filter_scopes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this logic can be put within the if filters:
block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
logging.info( | ||
f"Adding filter_scopes for dashboard {dashboard.id}: {json.dumps(filter_scopes)}" | ||
) | ||
if json_metadata: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don’t we also need to update the JSON metadata of the record if it’s falsey?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct! I fixed it.
some old dashboards have no json_metadata. so after this migration i want to keep their json_metadata attribute be None
instead of {}
"filter_immune_slice_fields", {} | ||
).items(): | ||
for column in columns: | ||
if immuned_by_column.get(column, None) is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets use a collections.defaultdict
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
ae97bf7
to
008ce3f
Compare
ping @john-bodley |
slice_params = json.loads(filter_slice.params or "{}") | ||
configs = slice_params.get("filter_configs") or [] | ||
|
||
if slice_params.get("date_filter"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we know this is the exhaustive list of slice parameters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about to add an iteration for all the possible date filter related keys. But think this function will be used by:
- db migration, only run once,
- dashboard import, to convert
filter_immune
metadata in old dashboard to use newfilter_scopes
Given the usage is kind of backward compatible not future compatible, I feel enumeration probably is good enough?
CATEGORY
Choose one
SUMMARY
Migrate
filter_immune_slices
andfilter_immune_filter_fields
since we enabled dashboard scoped filter metadatafilter_scopes
.TEST PLAN
We will not copy old
filter_immune_slices
andfilter_immune_filter_fields
metadata. If upgrade failed, users will lose filter immune settings in their dashboards.So before system admin do this upgrade, please backup
dashboards
table.ADDITIONAL INFORMATION
REVIEWERS
@serenajiang @etr2460 @mistercrunch