Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Line breaks between list items are not parsed correctly #214

Closed
1 task done
ealymbaev opened this issue Jun 8, 2020 · 11 comments · Fixed by #215
Closed
1 task done

Line breaks between list items are not parsed correctly #214

ealymbaev opened this issue Jun 8, 2020 · 11 comments · Fixed by #215

Comments

@ealymbaev
Copy link
Collaborator

Please help prevent duplicate issues before submitting a new one:

  • I've searched other open/closed issues for duplicates before opening up this new issue.

Report

What did you do?

I am trying to parse the following markdown example:

# First Example

- first item
- second item

# Second Example

- first item

- second item

What did you expect to happen?

Most markdown renderers (e.g. Github) insert a line break between list items in Second Example.

What happened instead?

But Down library does parse both first and second example same way. Here is the output of DebugVisitor:

Document
    ↳ Heading - L1
        ↳ Text - First Example
    ↳ List - type: Bullet
        ↳ Item
            ↳ Paragraph
                ↳ Text - first item
        ↳ Item
            ↳ Paragraph
                ↳ Text - second item
    ↳ Heading - L1
        ↳ Text - Second Example
    ↳ List - type: Bullet
        ↳ Item
            ↳ Paragraph
                ↳ Text - first item
        ↳ Item
            ↳ Paragraph
                ↳ Text - second item

I do understand that you use cmark library for parsing markdown nodes, but maybe I am missing something? Maybe some configuration options?

Or should I report this issue to cmark library repo?

@ealymbaev
Copy link
Collaborator Author

@johnxnguyen @rob-keepsafe any thoughts?

@johnxnguyen
Copy link
Owner

Hi @ealymbaev sorry for the delay. Indeed this is the behavior of cmark, whereby two list items separated by a single line break are considered to be part of the same list. I believe this is intentional and part of the commonmark spec.

Note however that two list items are separated by double newlines, then they both belong to separate lists:

1. I belong to list one.


1. I belong to list two

Does this help?

@ealymbaev
Copy link
Collaborator Author

Hi @johnxnguyen here is the quote from commonmark spec you provided:

If list items are separated by blank lines, Markdown will wrap the items in <p> tags in the HTML output. For example, this input:

*   Bird
*   Magic
will turn into:

<ul>
<li>Bird</li>
<li>Magic</li>
</ul>
But this:

*   Bird

*   Magic
will turn into:

<ul>
<li><p>Bird</p></li>
<li><p>Magic</p></li>
</ul>

As you may see, list items separated by blank line are parsed differently (wrapped with paragraph).

But cmark does not differentiate these cases and parse them into exactly the same tree. That is the main issue I am facing now.

P.S. I do not need to separate them into different lists, I need both items in the same list, but I need to know if they have (or don't have) blank line between them in source markdown.

@ealymbaev
Copy link
Collaborator Author

ealymbaev commented Jun 10, 2020

Here is an example showing how Github renders these cases.

# First Example

- first item
- second item

# Second Example

- first item

- second item

renders into:

First Example

  • first item
  • second item

Second Example

  • first item

  • second item

@johnxnguyen
Copy link
Owner

johnxnguyen commented Jun 10, 2020

P.S. I do not need to separate them into different lists, I need both items in the same list, but I need to know if they have (or don't have) blank line between them in source markdown.

Sorry, I misread the issue!

I think I understand now what's the difference between the two rendered examples. Playing with this example I see that the tree has the same node structure regardless of whether there is a newline between the list items. Both list items contain a paragraph node (just as the output of Down shows). However, there is a hidden property on the paragraph node of the first list item which is true if there is no newline and false if there is a newline.

So the issue isn't about cmark producing the same tree in the two different examples, but the lack of additional information about the (paragraph) nodes in the tree.

It leads me to think that this hidden property is used when rendering, for example, by wrapping the text in a <p> tag or, in the case of NSAttributedString, adjusting the paragraph spacing attribute.

I'd need to dig a little deeper to check if cmark exposes such a property on the paragraph node. If it does, then indeed we can fix the rendering within Down.

@ealymbaev
Copy link
Collaborator Author

Yes, that was exactly what I was talking about.

It would be great if you could add additional property to List node that indicates this case.

I don't know if it would help, but I have found tight property on cmark_list struct. Maybe this property does indicate this?

Here is the link to cmark man associated with this property: https://github.com/commonmark/cmark/blob/master/man/man3/cmark.3#L410

@johnxnguyen
Copy link
Owner

Hey @ealymbaev , you're right, the tight property is what we're looking for. I can add it as a property to List. Are you using the AttributedStringVisitor to render your markdown? Or are you using a custom visitor?

@ealymbaev
Copy link
Collaborator Author

@johnxnguyen actually I am using my own visitor, but it has its own AttributedStringVisitor and uses it for handling low-level nodes, like text, emphasis, link and so on. In other words I am using my visitor to form my own blocks, but the text inside blocks are parsed to attributed strings via AttributedStringVisitor.

So it would be enough to have this property in List node. I am interacting with nodes directly in my custom visitor.

@ealymbaev
Copy link
Collaborator Author

ealymbaev commented Jun 11, 2020

I have tested your pull request and it works!

Document
    ↳ Heading - L1
        ↳ Text - First Example
    ↳ List - type: Bullet, isTight: true
        ↳ Item
            ↳ Paragraph
                ↳ Text - first item
        ↳ Item
            ↳ Paragraph
                ↳ Text - second item
    ↳ Heading - L1
        ↳ Text - Second Example
    ↳ List - type: Bullet, isTight: false
        ↳ Item
            ↳ Paragraph
                ↳ Text - first item
        ↳ Item
            ↳ Paragraph
                ↳ Text - second item

Thank you for the feature!

@ealymbaev
Copy link
Collaborator Author

When are you planning to release new version with this feature?

@iwasrobbed
Copy link
Collaborator

@ealymbaev I'll release once the PR merges

Also I've given you collaborator access so you can feel free to open PRs and help improve Down in the future

Thanks for raising the issue and digging into the docs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants