Mishandling of spaces in HtmlNode.ToString ? #1509

Open
njlr opened this issue Jun 28, 2024 · 3 comments
@njlr
Contributor

njlr commented Jun 28, 2024

I have some HTML that Firefox renders like this:

[Screenshot 2024-06-28 at 19-33-26: the original HTML as rendered by Firefox]

If I "round-trip" this in FSharp.Data, then the output renders like this:

[Screenshot 2024-06-28 at 19-33-14: the round-tripped HTML as rendered by Firefox]

Here is the HTML:

<pre class="shiki vitesse-light" style="background-color:#ffffff;color:#393a34" tabindex="0"><code><span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> input</span><span style="color:#1E754F"> =</span><span style="color:#B5695999"> "</span><span style="color:#B56959">123</span><span style="color:#B5695999">"</span></span>
<span class="line"></span>
<span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> intOfDigit </span><span style="color:#1E754F">(</span><span style="color:#B07D48">x </span><span style="color:#1E754F">:</span><span style="color:#2E8F82"> char</span><span style="color:#1E754F">)</span><span style="color:#1E754F"> =</span></span>
<span class="line"><span style="color:#393A34">  int x </span><span style="color:#1E754F">-</span><span style="color:#393A34"> int </span><span style="color:#B56959">'0'</span></span>
<span class="line"></span>
<span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> number</span><span style="color:#1E754F"> =</span></span>
<span class="line"><span style="color:#393A34">  input</span></span>
<span class="line"><span style="color:#1E754F">  |></span><span style="color:#393A34"> Seq.fold</span></span>
<span class="line"><span style="color:#1E754F">    (fun</span><span style="color:#B07D48"> state next </span><span style="color:#1E754F">-></span><span style="color:#393A34"> state </span><span style="color:#1E754F">*</span><span style="color:#2F798A"> 10</span><span style="color:#1E754F"> +</span><span style="color:#393A34"> intOfDigit next</span><span style="color:#1E754F">)</span></span>
<span class="line"><span style="color:#2F798A">    0</span></span>
<span class="line"></span>
<span class="line"><span style="color:#393A34">printfn $</span><span style="color:#B5695999">"</span><span style="color:#1E754F">%i</span><span style="color:#B56959">{number}</span><span style="color:#B5695999">"</span><span style="color:#A0ADA0"> // 123</span></span>
<span class="line"></span></code></pre>

Here is a repro script:

#r "nuget: FSharp.Data, 6.4.0"

open FSharp.Data

let inputHtml = """<pre class="shiki vitesse-light" style="background-color:#ffffff;color:#393a34" tabindex="0"><code><span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> input</span><span style="color:#1E754F"> =</span><span style="color:#B5695999"> "</span><span style="color:#B56959">123</span><span style="color:#B5695999">"</span></span>
<span class="line"></span>
<span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> intOfDigit </span><span style="color:#1E754F">(</span><span style="color:#B07D48">x </span><span style="color:#1E754F">:</span><span style="color:#2E8F82"> char</span><span style="color:#1E754F">)</span><span style="color:#1E754F"> =</span></span>
<span class="line"><span style="color:#393A34">  int x </span><span style="color:#1E754F">-</span><span style="color:#393A34"> int </span><span style="color:#B56959">'0'</span></span>
<span class="line"></span>
<span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> number</span><span style="color:#1E754F"> =</span></span>
<span class="line"><span style="color:#393A34">  input</span></span>
<span class="line"><span style="color:#1E754F">  |></span><span style="color:#393A34"> Seq.fold</span></span>
<span class="line"><span style="color:#1E754F">    (fun</span><span style="color:#B07D48"> state next </span><span style="color:#1E754F">-></span><span style="color:#393A34"> state </span><span style="color:#1E754F">*</span><span style="color:#2F798A"> 10</span><span style="color:#1E754F"> +</span><span style="color:#393A34"> intOfDigit next</span><span style="color:#1E754F">)</span></span>
<span class="line"><span style="color:#2F798A">    0</span></span>
<span class="line"></span>
<span class="line"><span style="color:#393A34">printfn $</span><span style="color:#B5695999">"</span><span style="color:#1E754F">%i</span><span style="color:#B56959">{number}</span><span style="color:#B5695999">"</span><span style="color:#A0ADA0"> // 123</span></span>
<span class="line"></span></code></pre>
"""

let node = HtmlNode.Parse(inputHtml) |> List.exactlyOne

let outputHtml = node.ToString()

printfn "%s" outputHtml

Maybe I have missed something?

@cartermp
Collaborator

cartermp commented Jul 1, 2024

Nope, seems like a bug to me. Might be worth adding a test case and seeing how this function fares: https://github.com/fsprojects/FSharp.Data/blob/main/src/FSharp.Data.Html.Core/HtmlNode.fs#L115-L174

@njlr
Contributor Author

njlr commented Jul 1, 2024

I have created a test-case and made a potential fix here: #1510

However, I'm not sure if the logic is correct for all cases - are pre tags special in HTML?

@cartermp
Collaborator

cartermp commented Jul 2, 2024

Yeah, pre elements are meant to preserve the whitespace and line breaks of their contents exactly as written (formatting that is not HTML syntax). In this case we're not respecting that.
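
For illustration, here is a minimal sketch of a round-trip check along the lines discussed above. It is not the test case from #1510. The input `sampleHtml` and the helper `textOf` are hypothetical names made up for this example, and the script assumes the same FSharp.Data 6.4.0 package as the repro script. It compares the text content before and after serialization, since whitespace inside a pre element is part of that content.

```fsharp
#r "nuget: FSharp.Data, 6.4.0"

open FSharp.Data

// Hypothetical, shortened input: two lines of code inside <pre>,
// the second one indented by two spaces.
let sampleHtml = "<pre><code><span>let x =</span>\n<span>  1</span></code></pre>"

// Hypothetical helper: concatenated text content of a parsed fragment.
let textOf (nodes: HtmlNode list) =
    nodes
    |> List.map (fun (n: HtmlNode) -> n.InnerText())
    |> String.concat ""

// Parse the fragment and serialize it again (the "round-trip").
let parsed = HtmlNode.Parse(sampleHtml)
let roundTripped = parsed |> List.map (fun (n: HtmlNode) -> n.ToString()) |> String.concat ""

// Re-parse the serialized output so the text content can be compared.
let reparsed = HtmlNode.Parse(roundTripped)

let textBefore = textOf parsed
let textAfter = textOf reparsed

printfn "serialized:\n%s" roundTripped
printfn "text before: %A" textBefore
printfn "text after:  %A" textAfter
printfn "text preserved: %b" (textBefore = textAfter)
```

If the serializer adds indentation or line breaks inside the pre element, the last line should report a mismatch. Comparing text content rather than raw markup keeps the check independent of unrelated serialization details such as attribute quoting.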
