Flattening Structs #315
Comments
Is |
Yeah, would be nice to be able to do |
Hi! Support for flattening structs would be hard. It's doable, but not easily: there are a bunch of edge cases that can pop up as features are mixed together. I'd be happy to write up what makes this hard if you're interested, but in short I don't have plans to add this feature. That said, I'm curious about your use case. Why do you want to flatten the runtime structure here? Why not write out the full structure of
In [8]: class Option(msgspec.Struct):
   ...:     x: int  # made up some fields for here
   ...:
In [9]: class Data(msgspec.Struct):
   ...:     options: list[Option]
   ...:
In [10]: class Foo(msgspec.Struct):
    ...:     version: int
    ...:     data: Data
    ...:
In [11]: msg = """
    ...: {
    ...:     "version": 1,
    ...:     "data": {
    ...:         "options": [{"x": 1}, {"x": 2}]
    ...:     }
    ...: }
    ...: """
In [12]: msgspec.json.decode(msg, type=Foo)
Out[12]: Foo(version=1, data=Data(options=[Option(x=1), Option(x=2)])) |
Thanks for the answer! The reason for this is mostly an opinionated approach to an API wrapper I am working on. The data field for this payload feels a bit clunky and useless, as it doesn't really contain much and just makes things harder to access, especially due to Options containing more Options:
It was a choice we went with when implementing this part of the API for simplicity's sake. When I opened the issue, my idea for this was something along the lines of:
class Foo(msgspec.Struct):
    version: int
    option: Option = msgspec.field(location="data__option")
A little side effect here would also be allowing a syntax to rename attributes. As a quick dump of info, because this idea has been coming and going in my head, the syntax would go something like this:
Which I believe should cover all use cases for this. A tricky case I also thought about would be:
{
    "data": {
        "option": {}
    },
    "data__": {
        "option": {}
    }
}
class Obj(msgspec.Struct):
    data: Data
    data__: MoreData
    data_option: Option = msgspec.field(location="data__option")
    more_data_option: Option = msgspec.field(location="data____option")
    # or (which would be equivalent)
    some_data: Data = msgspec.field(location="data")
    some_more_data: Data = msgspec.field(location="data__")
    data_option: Option = msgspec.field(location="data__option")
    more_data_option: Option = msgspec.field(location="data____option")
In this case, the data fields will resolve properly, and whether a location flattens the struct or not is decided by whether the key exists, with the first taking priority. For extreme cases that I don't believe can really be found in the wild, an extra argument to force a location to be treated as a flattener could be added too. I understand this could be a lot more work than is actually useful, but I just wanted to dump the idea. I unfortunately don't have the C skills to implement this myself, but would love to try. I'm also interested in the limitations that you mentioned, as they might render my whole idea useless, since I lack information on the internals of msgspec 😅 |
The rename mechanism could probably be used for this (from the point of view of the user of the lib), something like this:
class Option(msgspec.Struct):
    x: int
foo_names = {
    "options": ["data", "options"],  # for example, TBD
}
class Foo(msgspec.Struct, rename=foo_names):
    version: int
    options: list[Option]
AFAIK Pydantic will support flattening in V2:
class Foo(BaseModel):
    bar: str = Field(aliases=[['baz', 2, 'qux']])
They have probably thought about edge cases, so it might be worth looking into as a good starting point. |
@davfsa If it's only about making the parsed objects more usable, what about simply:
class Foo(msgspec.Struct):
    version: int
    data: ...
    @property
    def options(self):
        return self.data.options
You might even hide the original data field, having it renamed to e.g. |
I have a similar use case. This is what the data looks like:
{
    "username": "jcrist",
    "attributes": [
        {"Name": "first_name", "Value": "Jim"},
        {"Name": "last_name", "Value": "Crist"},
        ...
    ]
    ...
}
I'd like to model it such that the attribute keys (like
class User(Struct):
    username: str
    first_name: str
    last_name: str
msgspec.json.decode(data, type=User)
# > MyUser(username='jcrist', first_name='Jim', last_name='Crist')
Even if I created a new (PS: Not sure if this is the right issue to ask this; it seemed very similar to mine, but also slightly different because there's a level of...indirection(?), where the relevant key-value pairs are 'hidden' under the For reference, I found a solution to a similar problem using Pydantic's |
I've also met a similar case. I think this kind of data schema occurs frequently with GraphQL APIs.
{
    "data": {
        "issues": {
            "nodes": [
                {
                    "id": "12345"
                },
                {
                    "id": "67890"
                }
            ]
        }
    }
}
Thanks to @ml31415, #315 (comment) helps a lot, but I still need to define 4 one-line structs to express it. I would be really grateful if there could be native support. |
What you could do is create tagged attribute objects. Then msgspec can distinguish them and you can add some verification.
class Attribute(msgspec.Struct, tag_field="Name"):
    pass
class Firstname(Attribute, tag="first_name"):
    Value: str  # add validation for first_name here as required
class Lastname(Attribute, tag="last_name"):
    Value: str  # separate validation for last_name goes here
Attribute = Firstname | Lastname
class User(msgspec.Struct):
    username: str
    attributes: list[Attribute]
Otherwise, if it's just about making the object easier to access, instead of modifying the data, just use a property again. Roughly like this:
class User(msgspec.Struct):
    username: str
    attributes: list[Attribute]
    def _attribute_dict(self):
        return {attr.Name.lower(): attr.Value for attr in self.attributes}
    def __getattr__(self, attr):
        try:
            return self._attribute_dict()[attr]
        except KeyError:
            raise AttributeError(attr) |
Hi @cutecutecat ,
import msgspec
from typing import Literal
class Node(msgspec.Struct):
    id: str  # ids are strings in the sample payload
class Container(msgspec.Struct):
    data: dict[Literal["issues"], dict[Literal["nodes"], list[Node]]]
    @property
    def nodes(self):
        return self.data["issues"]["nodes"]
|
I'm currently working on a Docker API client and flattening would be really useful. For example, we have a struct like this:
class ServiceSpec(Struct):
    name: str
    labels: dict[str, str]
    image: str
    environment: list[str]
And Docker expects something like this:
{
    "Name": "web",
    "Labels": {"com.docker.example": "string"},
    "TaskTemplate": {
        "ContainerSpec": {
            "Image": "nginx:alpine",
            "Env": ["SECRET_KEY=123"]
        }
    }
}
To achieve this, I currently use the following hack:
import msgspec
from msgspec import Struct, field
class DockerContainerSpec(Struct):
    image: str = field(name="Image")
    environment: list[str] = field(name="Env")
    @classmethod
    def from_spec(cls, spec: ServiceSpec):
        obj = msgspec.convert(spec, cls, from_attributes=True)
        return obj
class DockerTaskTemplate(Struct):
    _container_spec: DockerContainerSpec = field(default=None, name="ContainerSpec")
    @classmethod
    def from_spec(cls, spec: ServiceSpec):
        obj = msgspec.convert(spec, cls, from_attributes=True)
        obj._container_spec = DockerContainerSpec.from_spec(spec)
        return obj
class DockerService(Struct):
    name: str = field(name="Name")
    labels: dict[str, str] = field(name="Labels")
    _task_template: DockerTaskTemplate = field(default=None, name="TaskTemplate")
    @classmethod
    def from_spec(cls, spec: ServiceSpec):
        obj = msgspec.convert(spec, cls, from_attributes=True)
        obj._task_template = DockerTaskTemplate.from_spec(spec)
        return obj
This is a bit clumsy, but works out fairly well:
>>> spec = ServiceSpec(
...     name="app",
...     labels={},
...     image="nginx:alpine",
...     environment=["HELLO=world"]
... )
>>> msgspec.json.encode(DockerService.from_spec(spec))
b'{"Name":"app","Labels":{},"TaskTemplate":{"ContainerSpec":{"Image":"nginx:alpine","Env":["HELLO=world"]}}}'
UPD: this can be refactored as a wrapper for |
Description
This is more of a question than a feature request, but could turn into one.
One of my use cases involves deserialising something similar to:
into
I have scoured the documentation and can't find an easy way to do this. The way I have managed currently is by deserialising the Struct to a dict and then parsing the JSON as a dict (using attrs), but I would like to move away from it to reduce the amount of code to maintain (that is the reason I have been looking at msgspec, apart from the obvious speed gains!)
Thanks!