Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Union types complex #1549

Closed
wants to merge 20 commits into from
Closed

Conversation

idbentley
Copy link
Contributor

@idbentley idbentley commented Jul 17, 2024

Q A
Bug fix? no
New feature? yes
Doc updated no
BC breaks? no
Deprecations? no
Tests pass? yes
Fixed tickets
License MIT

This pull request is part two, building on #1546.

I made some assumptions while making this change, and am very open to feedback. I've tested the Union functionality pretty thoroughly by running my external project's test suite against it, so I'm fairly confident that things work in general, but because of my inexperience with jms/serializer there may be many interactions that I've missed.

This adds Union support for objects in 2 ways:

  1. Discriminated Unions

This is the easy case. For objects that have a field which specifies their type, i.e.:

class DiscriminatedAuthor
{
    /**
     * @Type("string")
     * @SerializedName("full_name")
     */
    #[Type(name: 'string')]
    #[SerializedName(name: 'full_name')]
    private $name;

    /**
     * @Type("string")
     */
    #[Type(name: 'string')]
    private $type = 'JMS\Serializer\Tests\Fixtures\DiscriminatedAuthor';
    ..snip...
}

I added a new Annotation: UnionDiscriminator that allows you to use this field to determine which object to deserialize into.

    #[UnionDiscriminator(field: 'type')]
    private DiscriminatedAuthor|DiscriminatedComment $data;

For more complex cases, the UnionDiscriminator Annotation can also receive a map attribute, which will map from the values contained in the discriminator field, to the fully qualified class.

i.e.

    #[UnionDiscriminator(field: 'objectType', map: ['author' => 'JMS\Serializer\Tests\Fixtures\DiscriminatedAuthor', 'comment' => 'JMS\Serializer\Tests\Fixtures\DiscriminatedComment'])]
    private DiscriminatedAuthor|DiscriminatedComment $data;

Known issues with Discriminated Unions:

  • @Type annotations don't support unions (I looked into modifying the lexer/parser, but it's pretty non trivial to add).
  • There's no support for "top level" deserialization for discriminated unions, because there's no way to annotate the call to ->deserialize. i.e. they are only supported when they are Properties on objects.
  1. Non-Discriminated unions

There is no "correct way" to deserialize into Union Types without discriminated fields, so I borrowed an approach I've seen in other languages (I recently added a similar implementation to the Python dataclasses_json package, and this approach is also similar to TypeScript's zod package).

Essentially, in order to deserialize a Union, I "try" to deserialize into the possible types one by one until I find a type that the data matches well, and I assume that is the correct type.

Step 1:
I Sort the objects based on the number of required properties on the object (descending from most required properties to least) primitives are always sorted first.

Step 2: Attempt to deserialize into each type

Deserialization may fail for a couple reasons:

If JSON contains data of a type that doesn't match the expected type from the Object.

If the JSON does not contain a required property from the Object (This behaviour is configurable in the UnionHandler constructor - but for resolving potential conflicts, it's essentially mandatory).

Notes:
If the JSON contains a field that the Object doesn't have a matching field for, then we just ignore that field - I don't treat that as a failure.

The bulk of the code supporting this change can be seen in the UnionHandler and the DerserializationGraphNavigator.

The enumeration of possible exceptions in UnionHandler seems like a point of fragility.

The introduction of requireAllRequiredProperties field on the JsonDeserializationVisitor is a bit peculiar - any feedback on a more idiomatic way to add this functionality is welcomed.

@scyzoryck @goetas - sorry that this PR is so big! I hope you'll agree it's worth the time to add.

@JustSteveKing
Copy link

This will be 🔥

Copy link
Collaborator

@goetas goetas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I go in a more detailed review, could you please answer my first comment?

I have to say that this looks great, thanks for the excellent work!

$this->accessor->setValue($object, $v, $propertyMetadata, $this->context);
} catch (NotAcceptableException $e) {
if (true === $propertyMetadata->hasDefault) {
$cloned = clone $propertyMetadata;
$cloned->setter = null;
$this->accessor->setValue($object, $cloned->defaultValue, $cloned, $this->context);
} elseif (!$allowsNull && $this->visitor->getRequireAllRequiredProperties()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should handle this here... could you please explain why do you need this check at this point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should handle this here... could you please explain why do you need this check at this point?

I added this check to make the normal deserialization process is less permissive, resulting in more accurate type detection.

Specifically, if the object you are deserializing into contains a required field, and the JSON does not contain that field, the existing deserializer will deserialize without complaint. While, that behaviour could be desirable in some cases, and presumably must be maintained for backwards compatibility, it's undesirable for deserializing non-discriminated unions.

For non-discriminated unions, we use a "guess and check" approach to deserialization, and the absence of a required field is a strong signal that the data we're deserializing may better match a different type.

class SimplePerson {
    public string $name;
    public ?string $nickname;
}

class SimpleCar {
    public string $model;
    public ?string $nickname;
}

class WrapperClass {
  public SimplePerson|SimpleCar $entity
}  

If we want to deserialize the following data:

{
  "entity": {
    "name": "ian"
  }
}

Without such a rule, the resulting class is determined by the order of application. If you consider SimplePerson first, then all is well. If you consider SimpleCar first, then the deserialization model skips name because it doesn't exist in the class, and deserialization will succeed returning a null SimpleCar.

By adding the rule, when SimpleCar is tested, the required model attribute is not present, so deserialization will fail.

I agree that the implementation is awkward, and would be happy to refactor!

Copy link
Collaborator

@goetas goetas Jul 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really avoid implementing such usecase. It is too specific and I'm really not sure if it worth the complexity.

{
  "entity": {
    "type": "simple_person",
    "name": "ian"
  }
}

A type filed should be always present in order to match efficiently the class name to use.

The same type of discussion were taken when implementing the @DiscrminatorMap for inheritance and even in that case any other solution was just not worth the effort.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth the effort, and I don't think it's a very specific usecase.

While, of course deserializing discriminated unions is easier, it's not the only type of data that exists. It's really easy to say "just change the data", but in many cases, that isn't possible - for example: if someone's deserializing data from an external API. Anecdotally in my situation, this case comes up regularly.

In order to provide good support for PHP 8, jms/serializer should have good support for union types,.

I'll note that other deserialization libraries do provide this functionality: dataclasses_json, pydantic, zod are just three examples.

I've provided an implementation that is opt-out (by not using the UnionHandler altogether, or you can construct the UnionHandler with requireAllProperties=false to avoid the new failure logic). Is the concern the added code complexity? I'm happy to refactor the code.

However, if you still think this isn't an appropriate feature fro the jms/serializer - I would ask that at least we provide a workaround for users. If we could include the ability to requireAllRequiredFields when deserializing, it would allow me to write a custom UnionHandler outside of jms/serializer.

@idbentley
Copy link
Contributor Author

I moved the substantive changes into smaller PRs, and removed the non-discriminated union deserialization support: #1554 and #1553 .

@idbentley idbentley closed this Jul 26, 2024
@idbentley idbentley reopened this Jul 29, 2024
@idbentley idbentley closed this Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants