Create new IntersectCorrespondingFields operator #1531
Two issues:
(1) This is not backward compatible. What if someone is already using it?
(2) The same behavior can be achieved with the current implementation:
Intersect(field_to_field={"label": "label", "name": "name"}, allowed_values=["b", "f"])
Am I missing something?
This relates to a need (e.g. in NER datasets) to filter multiple fields together (e.g. keep only two entity types and remove the corresponding entries from the start, end, and label fields). The current code changes are not backward compatible (e.g. they hard-code the field to "labels"). I'll write more details later on how to address this.
Signed-off-by: Yoav Katz <[email protected]>
To keep things backward compatible but still consistent, I've added a new operator called IntersectCorrespondingFields. From its docstring: "Intersects the value of a field, which must be a list, with a given list, and removes corresponding elements from other list fields."
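To make the intended behavior concrete, here is a minimal standalone sketch of the filtering described in that docstring. It is not the operator's actual code: the function name, the parameter names, and the sample NER-style instance are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List


def intersect_corresponding_fields(
    instance: Dict[str, Any],
    field: str,
    allowed_values: List[Any],
    corresponding_fields: List[str],
) -> Dict[str, Any]:
    """Keep only positions where instance[field] is in allowed_values,
    and drop the same positions from each corresponding list field.

    Hypothetical helper: names and signature are assumptions, not the PR's API.
    """
    values = instance[field]
    # Indices of the elements that survive the intersection.
    keep = [i for i, value in enumerate(values) if value in allowed_values]

    result = dict(instance)  # shallow copy; untouched fields pass through
    for name in [field, *corresponding_fields]:
        column = instance[name]
        if len(column) != len(values):
            raise ValueError(
                f"Field '{name}' has length {len(column)}, "
                f"expected {len(values)} to match '{field}'"
            )
        result[name] = [column[i] for i in keep]
    return result


# Made-up instance: four entity spans, of which only types "b" and "f" are kept.
example = {
    "label": ["a", "b", "f", "c"],
    "start": [0, 5, 9, 14],
    "end": [3, 8, 12, 18],
    "text": "some input text",
}
filtered = intersect_corresponding_fields(
    example,
    field="label",
    allowed_values=["b", "f"],
    corresponding_fields=["start", "end"],
)
```

The point of the sketch is that `start` and `end` are filtered by position, based on which `label` entries survive, rather than by comparing their own values against the allowed list.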
This PR updates the Intersect() operator, which allows us to filter out some values from the data when running benchmarks. Additionally, the tests were updated and rewritten for better readability.