Keep specific element to single line while using indent with serde serializer #655

Closed
ShaddyDC opened this issue Sep 21, 2023 · 8 comments · Fixed by #820
Labels
enhancement help wanted serde Issues related to mapping from Rust types to XML

Comments

@ShaddyDC

Hi, thanks for making this library; I'm getting a lot of use out of it.
However, I've run into an issue: I have to format the output in a specific way that a proprietary reader expects. Namely, I've got these structures:

use serde::{Deserialize, Serialize};

#[derive(Deserialize, Serialize, Debug)]
pub struct TextureCoordinates {
    #[serde(rename = "@dimension")]
    pub dimension: String,
    #[serde(rename = "@channel")]
    pub channel: String,
    #[serde(rename = "$value")]
    pub elements: String,
}
#[derive(Deserialize, Serialize, Debug)]
#[serde(rename_all = "PascalCase")]
pub struct VertexBuffer {
    pub positions: String,
    pub normals: String,
    #[serde(skip_serializing_if = "Option::is_none", default)]
    pub texture_coordinates: Option<TextureCoordinates>,
}

I serialize it like this (some context is omitted):

let mut ser = Serializer::with_root(&mut buffer, None).unwrap();
ser.indent(' ', 2);
ser.expand_empty_elements(true);

obj.serialize(ser).unwrap();

The texture coordinates serialise to something like this:

<VertexBuffer>
  <Positions>319.066 -881.28705 7.71589</Positions>
  <Normals>-0.0195154 -0.21420999 0.976593</Normals>
  <TextureCoordinates dimension="2D" channel="0">
    0.752494 0.201033,0.773967 0.201033
  </TextureCoordinates>
</VertexBuffer>

However, the program I need to serialize for requires this format instead:

<VertexBuffer>
  <Positions>319.066 -881.28705 7.71589</Positions>
  <Normals>-0.0195154 -0.21420999 0.976593</Normals>
  <TextureCoordinates dimension="2D" channel="0">0.752494 0.201033,0.773967 0.201033</TextureCoordinates>
</VertexBuffer>

Is there some way to achieve this that I'm not aware of?

I'm currently working around it by just not using indentation, so everything is written in a single line, but it would be nice to have.

@Mingun Mingun added enhancement serde Issues related to mapping from Rust types to XML help wanted labels Sep 21, 2023
@Mingun
Collaborator

Mingun commented Sep 21, 2023

The difference between Positions and TextureCoordinates is that the first is a (serde) primitive type, while the second is a struct, which can potentially have other elements inside. But in this specific case we are able to inspect all fields of the serialized struct, see that all of them are attributes, and not indent the content.

Anyway, writing of struct content is buffered, so we should be able to determine whether indentation is needed and apply it only if required.

I would be happy to merge a PR that does that.
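
A minimal sketch of that idea, using only the standard library. The `Node` type, `elem`, `text`, and `render` are hypothetical names made up for illustration; they are not quick-xml's internal buffer or API, and escaping is left out. The point is only the decision step: once the children are buffered, indentation is applied when at least one child is an element, and skipped when the content is text only.

```rust
// Hypothetical model of a struct's buffered content; these are not quick-xml types.
enum Node {
    Text(String),
    Element {
        name: String,
        attrs: Vec<(String, String)>,
        children: Vec<Node>,
    },
}

fn text(s: &str) -> Node {
    Node::Text(s.to_string())
}

fn elem(name: &str, attrs: &[(&str, &str)], children: Vec<Node>) -> Node {
    Node::Element {
        name: name.to_string(),
        attrs: attrs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect(),
        children,
    }
}

// Write `node` with a 2-space indent. No escaping is done (illustration only).
fn write_node(out: &mut String, node: &Node, depth: usize) {
    let pad = "  ".repeat(depth);
    out.push_str(&pad);
    match node {
        Node::Text(t) => out.push_str(t),
        Node::Element { name, attrs, children } => {
            out.push('<');
            out.push_str(name);
            for (key, value) in attrs {
                out.push_str(&format!(" {key}=\"{value}\""));
            }
            out.push('>');
            // The children are already buffered, so we can look at all of
            // them before deciding whether to indent.
            let has_child_element = children.iter().any(|c| matches!(c, Node::Element { .. }));
            for child in children {
                if has_child_element {
                    out.push('\n');
                    write_node(out, child, depth + 1);
                } else {
                    // Text-only content: keep it on the same line as the tags.
                    write_node(out, child, 0);
                }
            }
            if has_child_element {
                out.push('\n');
                out.push_str(&pad);
            }
            out.push_str(&format!("</{name}>"));
        }
    }
}

fn render(root: &Node) -> String {
    let mut out = String::new();
    write_node(&mut out, root, 0);
    out
}

fn main() {
    let root = elem("VertexBuffer", &[], vec![
        elem("Positions", &[], vec![text("319.066 -881.28705 7.71589")]),
        elem(
            "TextureCoordinates",
            &[("dimension", "2D"), ("channel", "0")],
            vec![text("0.752494 0.201033,0.773967 0.201033")],
        ),
    ]);
    println!("{}", render(&root));
}
```

In this model, mixed content (text next to elements) is still indented line by line; only the text-only case changes.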

@Mingun
Collaborator

Mingun commented Oct 6, 2024

Some investigation. Currently the type

struct Xml {
  tag1: String,// = "value1"
  #[serde(rename = "$text")] // or $value
  text: String,// = "text"
  tag2: String,// = "value2"
}

is serialized to the following XML when using an indent of 2 characters:

<some-surrounding-tag>
  <tag1>value1</tag1>
  text
  <tag2>value2</tag2>
</some-surrounding-tag>

The $text field (or a $value field, which is serialized without surrounding tags) is indented. The following result is probably closer to what is expected (i.e. indent all tags, but not the $text fields):

<some-surrounding-tag>
  <tag1>value1</tag1>text<tag2>value2</tag2>
</some-surrounding-tag>

If tag1 or tag2 is missing, the following results are expected.

Without tag1:

<some-surrounding-tag>text<tag2>value2</tag2>
</some-surrounding-tag>

Without tag2:

<some-surrounding-tag>
  <tag1>value1</tag1>text</some-surrounding-tag>

@ShaddyDC, @RedIODev and anyone else who is interested: could you confirm that those expectations are correct?
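
The proposed rule can be sketched as follows. This is a standalone model, not quick-xml code: `Field` and `render` are hypothetical names, the indent is fixed at 2 spaces, and only one nesting level is handled. The rule it encodes is "never add whitespace next to text": a tag is indented unless it directly follows text, and the closing tag gets its own line only when the last field is a tag.

```rust
// Hypothetical model of one element's fields; not quick-xml's real types.
enum Field {
    /// A `$text` field, written without surrounding tags.
    Text(&'static str),
    /// An ordinary field: (tag name, value).
    Tag(&'static str, &'static str),
}

/// Render `fields` inside `<name>` with a 2-space indent.
fn render(name: &str, fields: &[Field]) -> String {
    let mut out = format!("<{name}>");
    let mut prev_is_text = false;
    for field in fields {
        match field {
            Field::Text(text) => {
                out.push_str(text);
                prev_is_text = true;
            }
            Field::Tag(tag, value) => {
                // Indent a tag only when it does not directly follow text.
                if !prev_is_text {
                    out.push_str("\n  ");
                }
                out.push_str(&format!("<{tag}>{value}</{tag}>"));
                prev_is_text = false;
            }
        }
    }
    // The closing tag goes on its own line only when the last field is a tag.
    if matches!(fields.last(), Some(Field::Tag(..))) {
        out.push('\n');
    }
    out.push_str(&format!("</{name}>"));
    out
}

fn main() {
    let all = [
        Field::Tag("tag1", "value1"),
        Field::Text("text"),
        Field::Tag("tag2", "value2"),
    ];
    println!("{}", render("some-surrounding-tag", &all)); // all three fields
    println!("{}", render("some-surrounding-tag", &all[1..])); // without tag1
    println!("{}", render("some-surrounding-tag", &all[..2])); // without tag2
}
```

With these inputs the sketch reproduces the three expected outputs listed above.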

@RedIODev
Contributor

RedIODev commented Oct 6, 2024

Not sure.

<some-surrounding-tag>
  <tag1>value1</tag1>
  text
  <tag2>value2</tag2>
</some-surrounding-tag>

Looks correct to me. But I'm not familiar with what the correct formatting should be.
Is there something in the XML spec to refer to? If not, what behavior do other XML formatters choose?

Great regards
RedIODev

@Mingun
Collaborator

Mingun commented Oct 6, 2024

Java's Jackson XML gives this result:

<Xml>
  <tag1>value1</tag1>text
  <tag2>value2</tag2>
</Xml>

If the tag fields are commented out, then:
tag1 commented:

<Xml>text
  <tag2>value2</tag2>
</Xml>

tag2 commented:

<Xml>
  <tag1>value1</tag1>text
</Xml>

Both:

<Xml>text</Xml>

So it does not write an indent before the text field, but writes one after it. In other words, tags are always indented.

Code
package ru.mingun.xml.test;

import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlProperty;
import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlText;
/*
<dependency>
  <groupId>com.fasterxml.jackson.dataformat</groupId>
  <artifactId>jackson-dataformat-xml</artifactId>
  <version>2.9.7</version>
</dependency>
*/
public class Xml {
  @JacksonXmlProperty
  String tag1;
  @JacksonXmlText
  String text;
  @JacksonXmlProperty
  String tag2;

  public static void main(String[] args) throws Exception {
    final XmlMapper mapper = new XmlMapper();
    mapper.enable(SerializationFeature.INDENT_OUTPUT);
    final Xml bean = new Xml();
    bean.tag1 = "value1";
    bean.text = "text";
    bean.tag2 = "value2";
    System.out.println(mapper.writeValueAsString(bean));
  }
}

@Mingun
Collaborator

Mingun commented Oct 6, 2024

Java's XmlBeans:

<Xml>
  <tag1>value1</tag1>
  text
  <tag2>value2</tag2>
</Xml>

Without tag1:

<Xml>
  text
  <tag2>value2</tag2>
</Xml>

Without tag2:

<Xml>
  <tag1>value1</tag1>
  text
</Xml>

Without tag1 and tag2:

<Xml>text</Xml>

So the current behaviour is almost the same as XmlBeans, except for the text-only case.

Code
package ru.mingun.xml.test;

import org.apache.xmlbeans.XmlObject;
/*
<dependency>
  <groupId>org.apache.xmlbeans</groupId>
  <artifactId>xmlbeans</artifactId>
  <version>3.1.0</version>
</dependency>
*/
public class Xml {
  public static void main(String[] args) throws Exception {
    System.out.println(XmlObject.Factory.parse("<Xml><tag1>value1</tag1>text<tag2>value2</tag2></Xml>"));
    System.out.println(XmlObject.Factory.parse("<Xml>text<tag2>value2</tag2></Xml>"));
    System.out.println(XmlObject.Factory.parse("<Xml><tag1>value1</tag1>text</Xml>"));
    System.out.println(XmlObject.Factory.parse("<Xml>text</Xml>"));
  }
}

@Mingun
Collaborator

Mingun commented Oct 6, 2024

Python:

<Xml>
  <tag1>value1</tag1>text<tag2>value2</tag2>
</Xml>

Without tag1:

<Xml>text<tag2>value2</tag2>
</Xml>

Without tag2:

<Xml>
  <tag1>value1</tag1>text</Xml>

Without tag1 and tag2:

<Xml>text</Xml>

This is the same as what I proposed in my earlier comment.

Code
import xml.etree.ElementTree as ET

root = ET.fromstring("<Xml><tag1>value1</tag1>text<tag2>value2</tag2></Xml>")
ET.indent(root)
print(ET.tostring(root, encoding='unicode'))

root = ET.fromstring("<Xml>text<tag2>value2</tag2></Xml>")
ET.indent(root)
print(ET.tostring(root, encoding='unicode'))

root = ET.fromstring("<Xml><tag1>value1</tag1>text</Xml>")
ET.indent(root)
print(ET.tostring(root, encoding='unicode'))

root = ET.fromstring("<Xml>text</Xml>")
ET.indent(root)
print(ET.tostring(root, encoding='unicode'))

@ShaddyDC
Author

ShaddyDC commented Oct 7, 2024

I don't have much experience with xml, so I can only answer for the narrow use case I have, and testing it is a bit of a pain, so I'll answer based on what I believe to be the case instead.

I think this example which you gave would be correct for me.

<some-surrounding-tag>
  <tag1>value1</tag1>text<tag2>value2</tag2>
</some-surrounding-tag>

The problem was that the newlines around "text" were parsed as part of the text. I assume that would not be a problem with this example, and otherwise it looks as expected to me.

I am not confident about that, however, but for my specific use case it is enough if this works:

<some-surrounding-tag attribute="test1">text2</some-surrounding-tag>

I'm sorry that I cannot provide much broader feedback.
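
To make the whitespace problem concrete, here is a small illustration. `raw_text` is a hypothetical helper that does a naive string search; it is not a real XML parser and does not use quick-xml. It only shows what a reader that does not trim character data would receive for the two formattings discussed in this issue.

```rust
/// Return the raw character data between `<tag ...>` and `</tag>`.
/// Naive string search for illustration only; not a real XML parser
/// (it would also match a longer tag name that starts with `tag`).
fn raw_text<'a>(xml: &'a str, tag: &str) -> Option<&'a str> {
    let open = xml.find(&format!("<{tag}"))?;
    let start = open + xml[open..].find('>')? + 1;
    let end = start + xml[start..].find(&format!("</{tag}>"))?;
    Some(&xml[start..end])
}

fn main() {
    let indented = "<T dimension=\"2D\">\n    0.752494 0.201033\n  </T>";
    let inline = "<T dimension=\"2D\">0.752494 0.201033</T>";

    // With indentation, the newline and the leading spaces are part of the content.
    println!("{:?}", raw_text(indented, "T"));
    // Without it, the content is exactly the serialized value.
    println!("{:?}", raw_text(inline, "T"));
}
```

A reader that trims the value itself would accept both forms; one that does not will only accept the second.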

@Mingun
Collaborator

Mingun commented Oct 7, 2024

C# (mono implementation):

<?xml version="1.0" encoding="us-ascii"?>
<Xml xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <tag1>value1</tag1>text<tag2>value2</tag2></Xml>

Without tag1:

<?xml version="1.0" encoding="us-ascii"?>
<Xml xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">text<tag2>value2</tag2></Xml>

Without tag2:

<Xml xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <tag1>value1</tag1>text</Xml>

Without tag1 and tag2:

<?xml version="1.0" encoding="us-ascii"?>
<Xml xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">text</Xml>

Writing text disables indentation of the following fields, but it is still applied to the fields before the text field.

Code
using System;
using System.Xml;
using System.Xml.Serialization;

public class Xml
{
    public String tag1;
    [XmlText]
    public String text;
    public String tag2;

    public static void Main(string[] args)
    {
        var s = new XmlSerializer(typeof(Xml));
        using (var xmlWriter = XmlWriter.Create(Console.Out, new XmlWriterSettings { Indent = true }))
        {
            s.Serialize(xmlWriter, new Xml {
                tag1 = "value1",
                text = "text",
                tag2 = "value2",
            });
        }
    }
}
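
For comparison, the Jackson and C# behaviours quoted in this thread can be written down as two small rules. This is a sketch inferred only from the outputs shown in the comments, not from either library's source; `Rule`, `Field` and `render` are hypothetical names, and the XML declaration and namespace attributes from the C# output are left out. The tags-only case under the C# rule does not appear in the thread, so putting the closing tag on its own line there is an assumption.

```rust
// Rules inferred from the outputs quoted in the thread (an assumption,
// not taken from the libraries themselves).
#[derive(Clone, Copy)]
enum Rule {
    /// Jackson XML: every tag is indented; text is written as-is.
    Jackson,
    /// C# XmlSerializer (mono): indentation stops once text has been written.
    DotNet,
}

// Hypothetical model of one element's fields.
enum Field {
    Text(&'static str),
    Tag(&'static str, &'static str),
}

/// Render `fields` inside `<name>` with a 2-space indent under `rule`.
fn render(rule: Rule, name: &str, fields: &[Field]) -> String {
    let mut out = format!("<{name}>");
    let mut seen_text = false;
    let mut seen_tag = false;
    for field in fields {
        match field {
            Field::Text(text) => {
                out.push_str(text);
                seen_text = true;
            }
            Field::Tag(tag, value) => {
                let indent = match rule {
                    Rule::Jackson => true,
                    Rule::DotNet => !seen_text,
                };
                if indent {
                    out.push_str("\n  ");
                }
                out.push_str(&format!("<{tag}>{value}</{tag}>"));
                seen_tag = true;
            }
        }
    }
    // Decide whether the closing tag goes on its own line.
    let newline_before_close = match rule {
        Rule::Jackson => seen_tag,
        // Assumption for the tags-only case; see the note above.
        Rule::DotNet => seen_tag && !seen_text,
    };
    if newline_before_close {
        out.push('\n');
    }
    out.push_str(&format!("</{name}>"));
    out
}

fn main() {
    let all = [
        Field::Tag("tag1", "value1"),
        Field::Text("text"),
        Field::Tag("tag2", "value2"),
    ];
    for rule in [Rule::Jackson, Rule::DotNet] {
        println!("{}", render(rule, "Xml", &all));
        println!("{}", render(rule, "Xml", &all[1..])); // without tag1
        println!("{}", render(rule, "Xml", &all[..2])); // without tag2
    }
}
```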

Mingun added a commit to Mingun/quick-xml that referenced this issue Oct 11, 2024
Mingun added a commit to Mingun/quick-xml that referenced this issue Oct 12, 2024
Mingun added a commit to Mingun/quick-xml that referenced this issue Oct 15, 2024