-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade Pydantic to v2 #31
base: main
Are you sure you want to change the base?
Conversation
result_dict[k] = v | ||
if isinstance(v, BaseModel) and k not in resolved_object_keys: | ||
result_dict[k] = v.model_dump() | ||
validation.update(result_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change might seem a bit weird but it came up because Pydantic now treats Pydantic dataclasses and BaseModels the same way during serialization.
In Pydantic v1, calling model.dict()
with an instance of a dataclass
would not JSON serialize that dataclass, it would just return that dataclass.
This allowed for our Optimizer README example to work and resolve that dataclass instance.
e.g. with pydantic v1 this worked still:
import dataclasses
from typing import Union, Iterable
import catalogue
from confection import registry, Config
# Create a new registry.
registry.optimizers = catalogue.create("confection", "optimizers", entry_points=False)
# Define a dummy optimizer class.
@dataclasses.dataclass
class MyCoolOptimizer:
learn_rate: float
gamma: float
@registry.optimizers.register("my_cool_optimizer.v1")
def make_my_optimizer(learn_rate: Union[float, Iterable[float]], gamma: float):
return MyCoolOptimizer(learn_rate, gamma)
# Load the config file from disk, resolve it and fetch the instantiated optimizer object.
config = Config().from_disk("./config.cfg")
resolved = registry.resolve(config)
optimizer = resolved["optimizer"] # MyCoolOptimizer(learn_rate=0.001, gamma=1e-08)
However, when swapped to pydantic v2, the final line here was resolving the optimzer
to a dict.
...
optimizer = resolved["optimizer"] # {"learn_rate": 0.001, "gamma": 1e-08}
This has actually always been a bug, because previously we could not make MyCoolOptimizer
into a Pydantic model, it had to be a dataclass
(or any other class that wasn't a Pydantic model).
With this code above, we could make MyCoolOptimzer
into a Pydantic model and we'd get a Pydantic model back.
@@ -580,32 +580,32 @@ def test_schedule(): | |||
|
|||
cfg = {"@schedules": "test_schedule.v2"} | |||
result = my_registry.resolve({"test": cfg})["test"] | |||
assert isinstance(result, GeneratorType) | |||
assert isinstance(result, Iterator) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the other somewhat controversial change is the removal of checks for the Generator type. Pydantic supports validating Python generators via the Iterator
and Iterable
types, but not through the GeneratorType
. This is also how you're supposed to type stuff in Python so I think this should just be a documentation update but wanted to call it out.
…on into kab/pydantic-v2-compat
@@ -946,6 +1024,8 @@ def _update_from_parsed( | |||
# Check numpy first, just in case. Use stringified type so that numpy dependency can be ditched. | |||
elif str(type(value)) == "<class 'numpy.ndarray'>": | |||
final[key] = value | |||
elif isinstance(value, BaseModel) and isinstance(final[key], BaseModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case can occur from the above change to pass through resolved pydantic models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this, we have many users asking about it!
Just to start out with some comments about the CI and mypy...
.github/workflows/tests.yml
Outdated
- os: windows-2019 | ||
python_version: "3.6" | ||
- os: ubuntu-20.04 | ||
python_version: "3.6" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(And please don't just remove python 3.6 from the CI like this!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work Kabir, thanks!
I had this installed in the background for testing and have run into a few problems with pydantic v1.10 (since you can't use v2 with thinc/spacy right now):
Or with an existing config including a morphologizer with an internal tok2vec: [components.morphologizer.model.tok2vec]
@architectures = "spacy.Tok2Vec.v2"
[components.morphologizer.model.tok2vec.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = ${components.morphologizer.model.tok2vec.encode.width}
attrs = ["ORTH", "SHAPE"]
rows = [5000, 2500]
include_static_vectors = true
|
I don't currently understand what the differences are between the unit tests with Let me temporarily add a CI test to make sure it's not something on my end. |
* use old implementation of modelfield copy * ignore type error * Update __init__.py --------- Co-authored-by: Adriane Boyd <[email protected]>
I'm trying to think if there's a better way to define/manage the |
d55faee
to
c4f78e8
Compare
.github/workflows/tests.yml
Outdated
|
||
- name: Test with spacy | ||
run: | | ||
python -m pip install spacy | ||
python -m spacy init config -p tagger tagger.cfg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment that this is only intended to be a temporary addition to the CI. confection
needs a unit test that catches this issue.
Overview
Adds support for using Pydantic v1 or v2.
All relevant Pydantic functions are extracted to utils that support Pydantic v1 and v2 style.
CI Tests run with both Pydantic v1 and v2 but type checking focuses on Pydantic v2.
Besides the dependency upgrade there is 1 significant functionality change which fixes a bug where Pydantic models cannot currently be resolved from a registered function. More details in this comment: https://github.com/explosion/confection/pull/31/files#r1248253486