Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature request] Deserialization that only uses FromStr as a fallback #702

Closed
Scripter17 opened this issue Feb 14, 2024 · 3 comments
Closed

Comments

@Scripter17
Copy link

(Putting this here instead of the official serde repo because that one has an issue from 2018 that still hasn't been resolved.)

The serde documentation provides a way to tell a struct to deserialize a field from either FromStr::from_str or Deserialize::deserialize, but as far as I can tell the only ways to have the type in the field always have this behaviour is to manually implement deserialize or have a dummy type.

DeserializeFromStr almost handles this, but trying to derive both this and serde's normal Deserialize doesn't work.

For my particular use case, I have a struct that does lazy regex stuff and I would quite like to deserialize it from both {"pattern": "...", ...} and "..." without having to put #[serde(deserialize_with = "string_or_struct")] every time my wrapper appears in a field.

Effectively I want something like this:

use serde_with::DeserializeFromStrOrStruct;
use serde::Deserialize;

#[derive(DeserializeFromStrOrStruct)]
struct RegexWrapper {
    ...
}

impl FromStr for RegexWrapper {...}

#[derive(Deserialize)]
struct ThingThatContainsRegex {
    // Note the lack of any #[serde(deserialize_with = "...")]
    blah: RegexWrapper,
    ...
}
@jonasbb
Copy link
Owner

jonasbb commented Feb 18, 2024

It is possible to write code that does what you want. From a code perspective, the best place to add it would be the serde_derive macro, since that would allow placing the visit_map and visit_str functions next to each other, without any wrappers.
You can write your own Deserialize implementation and then delegate to derive generated code using MapAccessDeserializer. That deserialize implementation would be possible to be generated by a derive macro.

With a small modification, it should be possible to have that as a derive macro. Generate a RegexWrapperRemote as an exact copy of RegexWrapper, apply the #[serde(remote="...")] to it, and use it in the Deserialize implementation.

However, I am not sure if this StringOrStruct derive is that important overall.


/*
[dependencies]
serde.version = "*"
serde.features = ["derive"]
serde_json = "*"
*/

use serde::Deserialize;
use serde::Deserializer;
use std::str::FromStr;

#[derive(Debug, Deserialize)]
#[serde(remote = "Self")]
struct RegexWrapper {
    pattern: String,
}

impl FromStr for RegexWrapper {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(Self {
            pattern: s.to_string(),
        })
    }
}

impl<'de> Deserialize<'de> for RegexWrapper {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        struct V;

        impl<'de> serde::de::Visitor<'de> for V {
            type Value = RegexWrapper;

            fn expecting(&self, _f: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
                todo!()
            }

            fn visit_str<E: serde::de::Error>(self, s: &str) -> Result<Self::Value, E> {
                FromStr::from_str(s).map_err(|e| E::custom(e))
            }

            fn visit_map<M: serde::de::MapAccess<'de>>(
                self,
                map: M,
            ) -> Result<Self::Value, M::Error> {
                RegexWrapper::deserialize(serde::de::value::MapAccessDeserializer::new(map))
            }
        }

        deserializer.deserialize_any(V)
    }
}

#[derive(Debug, Deserialize)]
struct ThingThatContainsRegex {
    blah: RegexWrapper,
}

fn main() {
    let _ = dbg!(serde_json::from_str::<ThingThatContainsRegex>(
        r#"{"blah": "foobar"}"#
    ));
    let _ = dbg!(serde_json::from_str::<ThingThatContainsRegex>(
        r#"{"blah": {
            "pattern": "foobar"
        }}"#
    ));
}

@Scripter17
Copy link
Author

Scripter17 commented Mar 3, 2024

Finally got around to trying this and, uh, what?

How does this not cause a "trait implemented twice" error, or any error for that matter. How does #[serde(remote = "Self")] not explode into a stack overflow? Is this supposed to work or is it just a consequence of weird implementation details? How the hell does this just magically work???

And perhaps most importantly and answerably, is there a way to make the serialize derive macro not break with this? I do also need it to work with enums if that matters.

Oh, by the way, thank you so much. I was not ready to handle writing a deserializer for HashMap<String, OtherTypeWithSameFromStrThing> and this is a lifesaver.

@jonasbb
Copy link
Owner

jonasbb commented Mar 3, 2024

The remote implementation does not create an impl Deserialize. Instead it adds a deserialize directly to the RegexWrapper struct. That means if you write RegexWrapper::deserialize you call the inherent function first and you need additional qualification if you want to call the deserialize function of the impl Deserialize for RegexWrapper. The impl Deserialize just calls the inherent function, avoiding any recursion issues.

The Serialize will "break" and there is no way to avoid it, as it will now only create an inherent function. You can create a minimal impl Serialize which just forwards to that function.

This remote = "Self" is a special case, but it can come in handy if you want to use the generated implementation but adjust them a bit. You can use cargo expand (available on the playground) to see the code after macro expansion.

@jonasbb jonasbb closed this as completed Apr 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants