-
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement passthrough parser #2
Conversation
Great start! I've got 26 test cases. All of these pass when using a shortcode1, but 10 of them fail when using the passthrough approach. test cases added to passthrough/passthrough_test.go
I've added To me, the gold standard for testing is the method that Hugo users have been using for years: wrap the LaTeX in a shortcode that passes So I think we need all 26 test cases to pass. I've update the test repo, splitting the passthrough and shortcode tests into separate pages. Each page performs the same tests, in the same sequence, with the same name... so it's easy to compare results. At some point we'll need some false positive tests as well, if for no other reason than to provide guidance to content authors (i.e., what not to do, when to escape delimiters, etc.). Footnotes
|
As to the failing CI build, we check that the |
Sorry for the delays, inclement weather and all. I have some simplifications and fixes incoming that handles all of @jmooring's tests, with the main remaining obstacle being that goldmark's "raw text" renderer still does character substitutions like |
All the tests are now passing, with one caveat. I was struggling to get a proper inline and block parser working separately with some of the test cases (e.g., Example6 would trigger a new block on the third line, and I couldn't get goldmark to treat it as a continuation of the existing block). My workaround was to implement the entire extension using only inline parsing/rendering. As a result:
|
If I understand this correctly, it will prohibit us from creating future, separate render hooks for inline and block pass-through as mentioned here. The inline folks will say, "I want to be able to wrap this with a span so I can mouseover...". The block folks will say, "I want all blocks to be wrapped in a figure element." So, we'd need separate render hooks.
Can you briefly describe this? |
I don't recall seeing this contention anywhere, but one possibility would be to add a configuration option to render the passthrough with any user-configured fences (e.g., a hugo shortcode), which could then be transformed by other hugo mechanisms (sorry, not super familiar with what that might be; are shortcodes processed before or after goldmark?). That option could be specific to each delimiter pair, so block delimiters could get a different rendered fence than inline ones. In that direction, would the hugo team prefer for the render hooks to be hugo-based or goldmark-based? But I can see the issue that if these are nested within a paragraph tag, it might be tricky to lift out. I could give the block+inline approach another go, but another problem was that I couldn't even get the block parser to trigger in the case that the block delimiters were used inline. From what I recall, there was something weird about how goldmark handles lines starting with punctuation in re separating blocks, and I couldn't quite get to the bottom of it.
Instead of having two configuration options, one to specify block delimiters and one to specify inline delimiters, there is just one option to configure all delimiters, and they are all implicitly inline. |
One other test I just thought of was to handle |
This is a Hugo thing, not a Goldmark thing. We currently have render hooks for images and links (both are inline elements), and headings and fenced code blocks (both are block elements). But the Goldmark extension should be structured such that, in the future, we can easily hook into both inline pass-throughs and block pass-throughs. Shortcodes are not relevant to the either the Goldmark extension or render hooks.
I think this is consistent with the Hugo configuration described here:
Not sure I understand what you mean by this.
OK, but then we're not testing against what users currently do/get (shortcode). |
I don't think it's possible to separate the block/inline parsers and have a single set of configured delimiters. You have to separate them to say which to trigger parsing a block and which to trigger parsing inline, unless the two sets are identical.
I think you're saying one would (during hugo's configuration of the goldmark parser) override the renderer I wrote in this PR, and use some other renderer for the parsed nodes, right? As is, such a renderer would emit HTML that is nested inside a paragraph tag. Am I understanding you that this is not acceptable? I could also attach the parsed delimiters to the ast node, so that the behavior can branch based on the matched delimiters. In any case, a lot of this nuance focuses on how |
OK, I think I get this now. The original (my) configuration model was inadequate because it would require delimiter hardcoding in the extension. And we don't want to hardcode to LaTeX delims; this is a generic passthrough extension. (e.g., "I want to do inline passthrough of text that starts with "xxx" and ends with "yyy"). So, the proposed Hugo config should have been:
or maybe this is better (we can figure this out on the Hugo side later)
|
I would, for the following reasons:
Regarding the first reason... when this is released content authors might ask, "Can I remove all the If you haven't used them, I recommend trying Hugo's render hooks to see how they behave. What they render completely replaces the markdown without wrapping them in something else. That's true for both inline and block hooks. |
Are there any standard markdown syntax forms that allow an entity specified inline to split into a new block? Code fences do not: https://spec.commonmark.org/dingus/?text=Inline%20%60%60%60code%60%60%60%20equation I worry this may be a limitation of goldmark, though I would have to pinpoint the specific reason. If the same fences render differently in the two contexts for other markdown concepts, it might make more sense to handle the two cases separately, and only emit a passthrough block if the delimiters wrap the content on separate lines, as in the case of code fences. |
No.
This would not be compatible with: And the last two are about portability, the argument/justification that turned this effort from a proposal into an accepted enhancement. |
In your example, GitHub renders the block equations inside a I don't have a GitLab account so I can't check if that example also fails on GitLab, but based on your inline examples on GitLab it likely fails. I think we're going to have to compromise on something here. Which of the block equation examples are we willing to give up? I think it would be important to keep:
and
But not support split-line block equations happening inline.
I think I could make that work with a block parser, but if the priority is, "Can I remove all the |
OK, we can document that split-line block equations are not supported, as long as the other two block forms work.
Yes, that one. Question: will inline support all three forms (same line, separate line, split line)? |
Inline supports everything, that's what I pushed to the PR. I will give it
a go to see if I can get block parsing support for the non-split-line block
forms.
…On Thu, Jan 18, 2024, 4:55 PM Joe Mooring ***@***.***> wrote:
I think we're going to have to compromise on something here... I think I
could make that work with a block parser.
OK, we can document that split-line block equations are not supported, as
long as the other two block forms work.
If the priority is to support stricter consistency in future potential
overrides to the passthrough renderer
Yes, that one.
Question: will inline support all three forms (same line, separate line,
split line)?
—
Reply to this email directly, view it on GitHub
<#2 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAS2PKX4EAXLAOYS7EJO44TYPHABLAVCNFSM6AAAAABB2ZITMOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOJZGQ3DCMBQGE>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
Good news, after quite a bit more digging I found a workable solution to parse all desired syntax forms and support block delimiters expressed inline with a block renderer. The solution was to use Goldmark's concept of an However, this does change some of the test cases, since if we're properly emitting a block node, then we have to have a paragraph break in the rendered HTML. I.e., in the tests that @jmooring defined like
The result is
Which differs from the originally written expected output of
It also made me realize that some strange edge cases, such as using a block delimiter inside a list item, should probably remain as inline nodes. I still have to implement the escape idea described in #2 (comment), but in the meantime please let me know if the above tweak to the test cases is acceptable. |
Looking good! First, a couple of inconsequential edits to the test cases... I missed a line when creating example 2, and the last example has some leading spaces where it shouldn't. diffdiff --git a/passthrough/passthrough_test.go b/passthrough/passthrough_test.go
index 2493f12..b11a5ec 100644
--- a/passthrough/passthrough_test.go
+++ b/passthrough/passthrough_test.go
@@ -187,8 +187,9 @@ func TestExample1(t *testing.T) {
}
func TestExample2(t *testing.T) {
- input := `x = {-b \pm \sqrt{b^2-4ac} \over 2a}
-n$ equation`
+ input := `Inline $
+x = {-b \pm \sqrt{b^2-4ac} \over 2a}
+$ equation`
expected := "<p>" + input + "</p>"
actual := Parse(t, input)
@@ -482,22 +483,22 @@ $$`
func TestExample26(t *testing.T) {
input := `\[
- \begin{array} {lcl}
- L(p,w_i) &=& \dfrac{1}{N}\Sigma_{i=1}^N(\underbrace{f_r(x_2
- \rightarrow x_1
- \rightarrow x_0)G(x_1
- \longleftrightarrow x_2)f_r(x_3
- \rightarrow x_2
- \rightarrow x_1)}_{sample\, radiance\, evaluation\, in\, stage2}
- \\\\\\ &=&
- \prod_{i=3}^{k-1}(\underbrace{\dfrac{f_r(x_{i+1}
- \rightarrow x_i
- \rightarrow x_{i-1})G(x_i
- \longleftrightarrow x_{i-1})}{p_a(x_{i-1})}}_{stored\,in\,vertex\, during\,light\, path\, tracing\, in\, stage1})\dfrac{G(x_k
- \longleftrightarrow x_{k-1})L_e(x_k
- \rightarrow x_{k-1})}{p_a(x_{k-1})p_a(x_k)})
- \end{array}
- \]`
+\begin{array} {lcl}
+L(p,w_i) &=& \dfrac{1}{N}\Sigma_{i=1}^N(\underbrace{f_r(x_2
+\rightarrow x_1
+\rightarrow x_0)G(x_1
+\longleftrightarrow x_2)f_r(x_3
+\rightarrow x_2
+\rightarrow x_1)}_{sample\, radiance\, evaluation\, in\, stage2}
+\\\\\\ &=&
+\prod_{i=3}^{k-1}(\underbrace{\dfrac{f_r(x_{i+1}
+\rightarrow x_i
+\rightarrow x_{i-1})G(x_i
+\longleftrightarrow x_{i-1})}{p_a(x_{i-1})}}_{stored\,in\,vertex\, during\,light\, path\, tracing\, in\, stage1})\dfrac{G(x_k
+\longleftrightarrow x_{k-1})L_e(x_k
+\rightarrow x_{k-1})}{p_a(x_{k-1})p_a(x_k)})
+\end{array}
+\]`
expected := input
actual := Parse(t, input)
Second, I don't think we're going to know the impact, if any, of splitting one paragraph into two paragraphs until we get this into the field. I suspect it will be fine. Third, can you re-purpose gohugoio/hugo#11866 for the Hugo integration bit? I think we should defer asking for bep's review until both pieces are working. Again, this looks great! |
golang newbie question: how should I test it if the test code is in my repository?
I could update gohugoio/hugo#11866 to use my forks in their |
gohugoio/hugo#11866 is working, pointing to my local extension in b5ea64e10 Note:
|
Another test to add:
|
I am finding some issues with the integration, will move the discussion to gohugoio/hugo#11866 |
@jmooring As an practical note from the side: Assuming you got something that works OK (and the tests are green in this repo), I wouldn't mind just merging this PR into the main branch. This should make integration testing with Hugo a lot easier. This repo is unversioned and in progress, so people should know that it's not done yet. |
Implementation is ready for review with one caveat: I can't get goldmark to trigger the block parser for inputs where the block fences are placed inline:
Despite having
CanInterruptParagraph
to true in the parser, the block parser is never triggered and I'm not sure why.One workaround for LaTeX is to duplicate all the configured block fences in the config for the inline fences, which actually seems like the more appropriate general behavior: a configured block fence only triggers when it's the start of a new markdown paragraph.
Other than that, all the tests in
https://github.com/jmooring/hugo-testing/blob/hugo-github-issue-10894
seem to pass.