-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ORM: Use skip_orm
as the default implementation for SqlaGroup.add_nodes
and SqlaGroup.remove_nodes
#6720
base: main
Are you sure you want to change the base?
Conversation
The existing tests related are: def test_add_nodes(self):
"""Test different ways of adding nodes."""
node_01 = orm.Data().store()
node_02 = orm.Data().store()
node_03 = orm.Data().store()
nodes = [node_01, node_02, node_03]
group = orm.Group(label='test_adding_nodes').store()
# Single node
group.add_nodes(node_01)
# List of nodes
group.add_nodes([node_02, node_03])
# Check
assert set(_.pk for _ in nodes) == set(_.pk for _ in group.nodes)
# Try to add a node that is already present: there should be no problem
group.add_nodes(node_01)
assert set(_.pk for _ in nodes) == set(_.pk for _ in group.nodes)
def test_remove_nodes(self):
"""Test node removal."""
node_01 = orm.Data().store()
node_02 = orm.Data().store()
node_03 = orm.Data().store()
node_04 = orm.Data().store()
nodes = [node_01, node_02, node_03]
group = orm.Group(label=uuid.uuid4().hex).store()
# Add initial nodes
group.add_nodes(nodes)
assert set(_.pk for _ in nodes) == set(_.pk for _ in group.nodes)
# Remove a node that is not in the group: nothing should happen
group.remove_nodes(node_04)
assert set(_.pk for _ in nodes) == set(_.pk for _ in group.nodes)
# Remove one orm.Node
nodes.remove(node_03)
group.remove_nodes(node_03)
assert set(_.pk for _ in nodes) == set(_.pk for _ in group.nodes)
# Remove a list of Nodes and check
nodes.remove(node_01)
nodes.remove(node_02)
group.remove_nodes([node_01, node_02])
assert set(_.pk for _ in nodes) == set(_.pk for _ in group.nodes) This looks sufficient to me. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6720 +/- ##
==========================================
- Coverage 78.09% 78.07% -0.01%
==========================================
Files 564 564
Lines 42544 42524 -20
==========================================
- Hits 33219 33198 -21
- Misses 9325 9326 +1 ☔ View full report in Codecov by Sentry. |
`remove_nodes`; remove skip_orm related tests;
for more information, see https://pre-commit.ci
3d7d508
to
46085a5
Compare
A test case related to cmd tools found that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I'll give it another review, and then I think we can merge it. Though, I would leave this in main
for now, and not directly include it in the upcoming v2.7.0 release, and see if we encounter any issues. As the skip_orm
flag was never really used anywhere or documented, we don't really have experience with the implementation over the years (although the implementation looks solid, indeed).
@@ -177,6 +181,10 @@ def test_remove_nodes(self): | |||
group.remove_nodes([node_01, node_02]) | |||
assert set(_.pk for _ in nodes) == set(_.pk for _ in group.nodes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering as, at this point, the nodes
list is already an empty list and the group.nodes
already an empty Group
, if it's really necessary to compare the sets of pk
s, and if it might be worth initially adding another node to avoid comparing two empty collections here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You got a point. I'd say that another case, where at least one node is remained in the group, should be added, and meanwhile this case is also worth being kept, because it covers the scenario of removing all elements of a group (boundary testing).
assert set(_.pk for _ in nodes) == set(_.pk for _ in group.nodes) | ||
|
||
|
||
def test_add_nodes_skip_orm_batch(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if we still should have this check? Did you just remove all _skip_orm_
tests here, and keep the default tests, assuming that if they pass, then the updated implementation should be fine, as well, or was there a deeper meaning to removing this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just that I found the function of both test group are overlapping. I didn't find a scenario that is only covered the skip_orm
tests but in default tests. However, we could add a benchmark for large batches insertion/removal, as the main reason we switch to this implementation is to increase the performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops I just notice that I ignored the batch_size
arguments, as this argument seems to be passed through directly to the lower layers. I'll look into it tonight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's why I asked bc I didn't see the batch_size
argument anywhere in the remaining tests.
This PR removes the
skip_orm
flag and adopts the optimized approach behind it as the default behavior for theadd_nodes
andremove_nodes
methods.As discussed in #5453, the ORM-based implementations of
add_nodes
andremove_nodes
were found to be inefficient, and to address this, theskip_orm
flag was introduced in b40b2ca and bced84e. This flag enabled a faster, core API-based approach for these operations.Over the years, no users have reported bugs related to this method, and the existing tests are sufficiently comprehensive to ensure its safety. Consequently, it is a reasonable decision to make the
skip_orm
implementation the default for bulk insertion and removal of elements within a group.Closes #5453.