Improvements in union matching logic during validation #1332
@@ -111,7 +111,9 @@ impl UnionValidator {
         let strict = state.strict_or(self.strict);
         let mut errors = MaybeErrors::new(self.custom_error.as_ref());

-        let mut success = None;
+        // we use this to track the validation against the most compatible union member
+        // up to the current point
+        let mut success: Option<(Py<PyAny>, Exactness, Option<usize>)> = None;

         for (choice, label) in &self.choices {
             let state = &mut state.rebind_extra(|extra| {
@@ -134,16 +136,20 @@ impl UnionValidator {
                     _ => {
                         // success should always have an exactness
                         debug_assert_ne!(state.exactness, None);
+
                         let new_exactness = state.exactness.unwrap_or(Exactness::Lax);
-                        // if the new result has higher exactness than the current success, replace it
-                        if success
-                            .as_ref()
-                            .map_or(true, |(_, current_exactness)| *current_exactness < new_exactness)
-                        {
-                            // TODO: is there a possible optimization here, where once there has
-                            // been one success, we turn on strict mode, to avoid unnecessary
-                            // coercions for further validation?
-                            success = Some((new_success, new_exactness));
+                        let new_fields_set = state.fields_set_count;
+
+                        let new_success_is_best_match: bool =
+                            success.as_ref().map_or(true, |(_, cur_exactness, cur_fields_set)| {
+                                match (*cur_fields_set, new_fields_set) {
+                                    (Some(cur), Some(new)) if cur != new => cur < new,
+                                    _ => *cur_exactness < new_exactness,
+                                }
+                            });
Can we add a comment here describing the logic? I read it as follows:
What about I think instead if we passed Maybe

I've added the comment :)

That's correct re current behavior:

from pydantic import BaseModel, TypeAdapter

class DictModel(BaseModel):
    a: int
    b: int

ta = TypeAdapter(DictModel | dict[str, int])
print(repr(ta.validate_python({'a': 1, 'b': 2})))
#> {'a': 1, 'b': 2}
print(repr(ta.validate_python({'a': '1', 'b': '2'})))
#> DictModel(a=1, b=2)
print(repr(ta.validate_python(DictModel(a=1, b=2))))
#> DictModel(a=1, b=2)

Honestly, I think this might be a case where we suggest that users use
+
+                        if new_success_is_best_match {
+                            success = Some((new_success, new_exactness, new_fields_set));
                         }
                     }
                 },
@@ -158,7 +164,7 @@ impl UnionValidator {
         }
         state.exactness = old_exactness;

-        if let Some((success, exactness)) = success {
+        if let Some((success, exactness, _fields_set)) = success {
             state.floor_exactness(exactness);
             return Ok(success);
         }
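To make the comparison rule in the diff easier to reason about, here is a minimal pure-Python sketch of it. It is a model of the logic only, not pydantic-core code: the names `is_best_match` and `pick_best` are hypothetical, the `Exactness` ordering (Lax < Strict < Exact) is an assumption, and the early return on an exact match that the real loop performs is left out.

```python
from enum import IntEnum
from typing import Any, Iterable, Optional, Tuple


class Exactness(IntEnum):
    # Assumed ordering for this sketch: Lax < Strict < Exact.
    LAX = 0
    STRICT = 1
    EXACT = 2


# (validated value, exactness, fields_set_count)
Match = Tuple[Any, Exactness, Optional[int]]


def is_best_match(
    current: Optional[Match],
    new_exactness: Exactness,
    new_fields_set: Optional[int],
) -> bool:
    """Hypothetical helper mirroring `new_success_is_best_match` in the diff."""
    if current is None:
        # No success recorded yet, so any success becomes the best match.
        return True
    _, cur_exactness, cur_fields_set = current
    if (
        cur_fields_set is not None
        and new_fields_set is not None
        and cur_fields_set != new_fields_set
    ):
        # Both counts are known and differ: more fields set wins.
        return cur_fields_set < new_fields_set
    # Counts are missing or tied: fall back to exactness.
    return cur_exactness < new_exactness


def pick_best(candidates: Iterable[Match]) -> Optional[Match]:
    """Fold over successful union members, keeping the best one seen so far."""
    best: Optional[Match] = None
    for value, exactness, fields_set in candidates:
        if is_best_match(best, exactness, fields_set):
            best = (value, exactness, fields_set)
    return best
```

Under these assumptions, a lax match that sets two fields beats a strict match that sets one, while two matches with the same count (or with a missing count on either side) are decided by exactness alone. Ties keep the earlier member.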
I think we probably need to reset fields_set_count between each loop turn? (e.g. like exactness is reset on line 124).

Propagating on the state like this potentially has surprising effects: for example, list[ModelA] | list[ModelB] probably now gets chosen based on whether the last entry fitted ModelA or ModelB better. We could deal with that by explicitly clearing num_fields_set in compound validators like list validators.

Overall I'm fine with this approach of using state. I think either this or num_fields (and exactness) are all fiddly to implement and will likely just need lots of hardening via test cases.

Done!

Behavior looks good for the list case you've mentioned above:

I think this is because I'm now clearing the relevant parts of the state, and fields_set_count is by default None, in which case we only consider exactness.