Broken indentation in ordered lists indented by 2 spaces #594
Comments
For ordered lists, the delimiter is one character wider, so they need at least one more space of indentation to form a sublist. I.e. for - foo the delimiter is "-", whereas in 1. foo the delimiter is "1." |
The trick is: to form a sub list you need to line the nested list up so that it is inside the original list's content (hence why the delimiter width makes a difference). |
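To make the rule above concrete, here is a minimal Python sketch, not the reference implementation. The names MARKER, content_offset and nests_under are made up for illustration. It deliberately ignores tabs, the case of 5 or more spaces after the marker, and the upper bound past which extra indentation turns into an indented code block.

```python
import re

# Hypothetical helper pattern: up to 3 leading spaces, then a bullet
# (-, +, *) or 1-9 digits followed by "." or ")", then the spaces
# that separate the marker from the content.
MARKER = re.compile(r"^( {0,3})([-+*]|\d{1,9}[.)])( +)")


def content_offset(line: str) -> int:
    """Column at which the content of a list item starts."""
    m = MARKER.match(line)
    if m is None:
        raise ValueError("not a list item")
    return m.end()


def nests_under(parent: str, child: str) -> bool:
    """True if `child` is indented at least as far as the content of `parent`."""
    indent = len(child) - len(child.lstrip(" "))
    return indent >= content_offset(parent)


print(content_offset("- foo"))      # bullet marker plus one space
print(content_offset("1. foo"))     # one character wider than the bullet
print(nests_under("- foo", "  - bar"))
print(nests_under("1. foo", "  1. bar"))
print(nests_under("1. foo", "   1. bar"))
```

Under these assumptions, a 2-space indent is enough to nest under "- foo" (content starts at column 2) but not under "1. foo" (content starts at column 3), which is the asymmetry this thread is about.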
This makes no sense at all to me as a user.
Why should I care as a user that the delimiter width is an additional character? This sounds like a technicality to me. The way I see it, and I suppose the way pretty much all Markdown users see it (don't you too?), I'm just progressively indenting things by pressing tab, like I usually do in a regular programming language. I don't see why this is broken by design. Plus, I think what you're saying is wrong: the width of the delimiter for ordered lists can be 2 or more characters, as I can write
This doesn't seem to be the case at all since the following:
1. Foo
  1. Bar
    1. Baz
1000. Foo
  1000. Bar
    1000. Baz
Is rendered as:
So what's actually going on here? How is CommonMark expecting nested ordered lists to be written? I really think indentation should be governed solely by how many "indentation blocks" there are (i.e. in this scenario, how many sequences of 2 spaces), so why is this not the case? Plus, what editor am I supposed to use to write these sorts of varying indentations depending on the context? All the ones I'm aware of keep a constant indentation size per language when tabbing and that's it. |
The parser sees the list content in the second list as
Foo
1000. Bar
1000. Baz
There is a rule which says ordered list items can only interrupt paragraphs if they start at 1, to avoid accidentally starting a list, which is why the Bar line doesn't start a sublist. You can get the behaviour I think you're expecting by using a new line to break the paragraphs
1000. Foo
1000. Bar
1000. Baz
The point of the current behaviour, afaik, is to make it easy to read the structure that you'll get by lining things up with each other. If you write
1. Foo
  1. Bar
then it looks like the bar list falls one space short of the contents of the foo list (since it starts one space before Foo), so this is the structure you get too. If you write
1. Foo
   1. Bar
then things line up, and so you get a second list contained in the first one.
This is correct, if you write a longer marker then this affects how things need to be lined up. |
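The paragraph-interruption rule mentioned above can be sketched roughly like this in Python. This is only an illustration: the names BULLET, ORDERED and can_interrupt_paragraph are invented here, and the function is meant to be applied to a line that has already had the enclosing item's indentation stripped.

```python
import re

# Hypothetical patterns for a list item start that has some content.
BULLET = re.compile(r"^ {0,3}[-+*] +\S")
ORDERED = re.compile(r"^ {0,3}(\d{1,9})[.)] +\S")


def can_interrupt_paragraph(line: str) -> bool:
    """True if `line` could start a list in the middle of a paragraph.

    Assumption taken from the discussion above: bullet items may
    interrupt a paragraph, ordered items only if they start at 1.
    """
    if BULLET.match(line):
        return True
    m = ORDERED.match(line)
    return m is not None and int(m.group(1)) == 1


print(can_interrupt_paragraph("1. Bar"))
print(can_interrupt_paragraph("1000. Bar"))
print(can_interrupt_paragraph("- Bar"))
```

So a line such as "1000. Bar" directly after "Foo" is treated as more paragraph text, while "1. Bar" in the same position would start a list.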
There's a big explainer in the spec around motivation for this behaviour: https://spec.commonmark.org/0.29/#motivation |
Thanks for taking the time to reply to me.
That's not the behavior I want, the behavior I want is being able to write lists with a consistent indentation, whether they are ordered or unordered. I'm a bit of a whitespace lover myself, but adding a line break before every single item is a bit too much even for me.
I can see it being easier to read when talking about a short list like that, but IMHO this reasoning is fundamentally flawed for the following reasons:
This indentation mess is still not rendering properly for me though:
1000. Foo
      1001. Bar
            1002. Baz
I get the following HTML out:
<ol start="1000">
<li>Foo<br>
1001. Bar<br>
1002. Baz</li>
</ol>
I believe I'm aligning those lists "as expected", why is this not working? |
Commenting the spec:
This doesn't seem to be the case to me at all, since the 2 lists mentioned in my initial post do not render properly, and indenting just by tabbing is very natural, while aligning according to how many digits there are is not, at least as far as I, and my editor (VSCode), are concerned.
Nonsense, if something is indented less than something else it can't be a sibling nor child of the other thing, if there's a rule that goes against this then it seems very wrong to me.
I think I see the issue now, the fact that |
@aidantwoods I think the fact that the two lists I provided in my initial post aren't rendered properly is enough to convince me that there's something broken here. And how am I even supposed to write aligned ordered lists like that if I use tabs rather than spaces? Who's the person that can decide to get this fixed in a future revision of the spec? I'd like to convince him/her that the current behavior should be changed, or at the very least I'd very much like to hear his/her reasoning to why the lists I provided shouldn't actually be rendered the way me and my editor expect them to. |
If you are using the tab key to indent, some editors support inserting a flexible amount of space characters depending on language and context (smart tabs). Alternatively, the horizontal tabulator character can be parsed and displayed in a dynamic way to make local alignment more sensible; this is known as elastic tabstops and is quite similar to how |
@Crissov It doesn't look like that's supported in VSCode unfortunately (microsoft/vscode#3932), which is probably the most popular editor nowadays. |
Perhaps you should read on in the spec...
This, I take it, is the proposal you favor. So why don't you see what the spec says about it? Believe me, there was extensive thought and debate about all the possible list indentation rules when the spec was drafted. There are good reasons for the choice that was made. (If we didn't have indented code blocks, and didn't have to care as much about backwards compatibility, then the proposal of allowing any indentation under a list item to start a sublist would make a lot of sense. Indeed, see my post https://talk.commonmark.org/t/beyond-markdown/2787, under the heading "Indented code blocks and lists," which discusses precisely the case you're interested in.) |
@jgm thanks for joining in. I don't doubt that there's been plenty of thought and care into writing the CM spec, and that we are somewhat limited by backwards compatibility, but I believe here there's an issue that needs to be addressed.
More or less, but my point is more about consistency of indentation between ordered and unordered lists than anything else. I think my point could be condensed into this sentence: unordered lists indented by 2 spaces are rendered correctly, so why aren't ordered lists indented by 2 spaces rendered correctly too? Assuming that by "indentation" we are referring to the whitespace at the beginning of the line, not including the list delimiter. Assuming that whatever restriction there might be because of unfenced code blocks would impact both unordered and ordered lists alike. Assuming that by "correctly" we mean that the following deeply nested ordered lists are in fact recognized as deeply nested ordered lists and not something else:
1. Foo
  1. Bar
    1. Baz
      1. Qux
Your post talks about hypothetically supporting deeply nested lists that are indented by just one space. That's not the argument that I'm making; in fact, unordered lists are rendered correctly already as far as I am concerned. My point is that ordered lists follow an indentation rule that's incomplete and incompatible with the way unordered lists are indented. By "incompatible" I mean that the same indentation used for rendering deeply nested unordered lists can't be used for rendering deeply nested ordered lists. By "incomplete" I mean that I believe by adding an additional rule we could support rendering unordered and ordered lists with the same, consistent, indentation, while still supporting the current alignment rules. I think it's fine if some people prefer the current behavior, and that can't be changed anyway for backwards compatibility reasons, but why does indenting unordered lists by 2 spaces work while indenting ordered lists by 2 spaces doesn't? I can't see why the example list I posted above can't be rendered properly. |
You're assuming things about what is "correct," when that is the very thing at issue here. And you're not engaging with the argument that's given in the spec itself (the part I was trying to point you to).
It's because the proposal commonmark implements makes the required indentation depend on the width of the list item marker, and the marker is longer for ordered lists than for unordered ones. The substantive question is: why do it this way rather than requiring some fixed indentation beyond the start of the last marker (e.g. >= 1 space, or >= 2 spaces). Well, that is answered in the spec itself:
The "Beyond Markdown" post talks about allowing one-space indent on bullet list items, but the same thing that is said there applies to the proposal to allow two-space indent on ordered list items. As I say in the post, the key thing blocking the proposal you favor is the proper handling of indented code blocks inside lists items. That's why we can't adopt this proposal without a lot of backwards-incompatible changes. |
If I'm understanding this correctly the indentation of our unfenced code block in the example above is measured from the beginning of "foo" because otherwise it would be considered indented by 8 spaces and that would break things. Correct? I wasn't able to find in the spec why indenting unfenced code blocks by 8 spaces would break things, but I can imagine the following scenario:
Is this basically the issue? Or am I missing something? IMHO those might not be blocking issues as there might be multiple ways to disambiguate the situation:
This seems like a tractable issue to me, and one worth resolving, because I'm pretty confident most people do just progressively indent things by pressing the tab key repeatedly, and they might not be stumbling upon this issue just because they are indenting lists with 4 spaces rather than 2. Also, the writing experience is at least as important as the reading experience, and the most popular editor out there (VSCode) has no concept of "elastic tabstops". Should we fall back to the spacebar for indentation and alignment? I hope the answer is no. Did I miss some other ambiguity which is totally intractable and is not a very rare edge case? At the moment I can't think of an unordered list with some unfenced code blocks embedded within it which I would be able to render correctly in my mind, while switching that list from unordered to ordered would make me no longer able to render it with the same nesting levels. Can you think of any? |
I do think there is a bug in how ordered sublists with a start number are handled, as mentioned in #594 (comment)
gives as expected,
But
gives
which I think is incorrect. If we're going to support ordered lists with a starting number, that should be valid for any ordered sublists |
@digitalmoksha this is really a separate issue and shouldn't be discussed here. |
@fabiospampinato One could probably pile up a bunch of ad hoc heuristics to get reasonable results much of the time, but this would make the spec even more complex and less predictable. The current setup has a clear enough mental model: remove the list marker and spaces preceding the first content, and remove an equivalent number of spaces of indentation from each subsequent line, then parse the result, and that's the content of the list item. I think things would really start to be too unpredictable if, e.g., parsing was different depending on whether the list item contained a code block. And certainly we don't want to make it impossible to have code blocks where the code starts with spaces. This means that users have to learn that the "overlapping style" of nested lists won't work: the nested list must be indented at least as far as the content of the enclosing list item. Admittedly this might surprise some people, but one can learn it. If you don't like variable indentation, then, as the spec notes, you can use the 4-space rule and this will always work (unless you have 3-digit ordered list numbers, which is extremely rare). |
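The mental model described above (strip the marker, strip the same width from the following lines, then parse what is left) could be sketched in Python along these lines. This is a simplified illustration rather than how any real parser is written: MARKER and list_item_content are hypothetical names, and lazy continuation lines and tabs are ignored.

```python
import re

# Hypothetical pattern: optional indent, a bullet or ordered marker, spaces.
MARKER = re.compile(r"^( {0,3})([-+*]|\d{1,9}[.)])( +)")


def list_item_content(lines: list[str]) -> list[str]:
    """Return the lines that make up the content of the first list item.

    The marker and the spaces after it are removed from the first line,
    and the same number of columns is removed from each following line.
    A non-blank line that is not indented that far ends the item
    (lazy continuation is deliberately not modelled).
    """
    m = MARKER.match(lines[0])
    if m is None:
        raise ValueError("first line is not a list item")
    width = m.end()
    content = [lines[0][width:]]
    for line in lines[1:]:
        if line.strip() == "":
            content.append("")
        elif line.startswith(" " * width):
            content.append(line[width:])
        else:
            break
    return content


print(list_item_content(["- Foo", "  - Bar"]))
print(list_item_content(["1. Foo", "   1. Bar"]))
print(list_item_content(["1. Foo", "  1. Bar"]))
```

In this sketch the first two calls keep the Bar line as part of the item's content, so it can be parsed as a sublist, while in the third call the 2-space indent falls short of the 3-column marker width and the Bar line is left out.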
Two simple solutions on the authorʼs side:
Make either of these a habit. - donʼt!
- do
- do
1. do
2. donʼt
3. donʼt |
The less complex and the more predictable the better, I think we agree on that. Surely though there's a threshold between added complexity and usefulness somewhere, I think we can make the spec slightly more complex for enabling consistent lists indentations.
I would bet that the majority of the people writing Markdown don't even know about this though. I've been writing Markdown for years and I'm only now discovering this "quirk" in the spec. And of those who may stumble upon this, how many are going to even know what CommonMark is and are going to propose to amend the spec? Not many, I would imagine.
I think we can come up with some clear to understand and deterministic rule that works and doesn't complicate things unnecessarily.
Certainly.
Even if everybody could be educated about this, and good luck with that, I still think this required alignment is very unergonomic to write, and that kinda goes against the whole point of Markdown I guess.
Well I don't like 4 spaces indentations either 🤷‍♂️ Maybe going back in time the 4-spaces rule could have been enforced better but unfortunately now this can't be enforced anymore for backwards compatibility. If I'm understanding your point of view correctly the heuristics I proposed would complicate the spec too much for not enough gains. What about something like this:
This seems pretty clear and deterministic to me, it should be easy for implementors of the spec to implement (and I'm willing to spend some time crafting PRs for some of the most popular implementations out there), it doesn't break how code blocks work, it maintains backwards compatibility, and it would solve the issue I'm stumbling on completely, enabling lists of all kinds to be written with a consistent indentation (which is quite of an upside IMHO). What do you think? |
It's not just a question of sublists, but of all subordinate block-level content under a list item. So what about
Would you want to treat the |
I think your point is that if sublists indented by 2 spaces are now considered as proper sublists then subparagraphs indented by 2 spaces maybe should be affected by the same rule as well? IMHO that snippet should be parsed the same way it's parsed currently, for the following reasons:
|
I'm skeptical, but if you want to propose a minimal diff of spec.txt I could take a look at it. |
What makes you skeptical about this? Is the proposed solution too complex or not elegant enough? Is the issue that it's meant to solve not much of an issue after all in your eyes? Something else...? I can absolutely try to amend the spec and submit a PR, but I'm not an expert on the current spec and I've never written any formal specification either, so you or somebody else should probably review/write the final diff. |
The spec is a whole ball of wax. The idea is to specify general principles that give the right result, not just say, "I want this result for this case." So the hard work consists in reworking the spec to give the result you want.
The "proposed solution" isn't yet a solution until you show how to do that.
|
I can sympathize with that, from my point of view this is more like "the spec doesn't give the right result, let's improve it" though. |
I'm going to say some things that someone has to say. It's best if it comes from an observer. If any of you think this is inappropriate or off-base, just say so and I'll delete this comment. I just saw this thread now, read it from top to bottom, and have been cringing much of the time. @fabiospampinato, with all due respect, you're not showing much due respect. I'm amazed at the seemingly infinite patience and kindness shown in all the thoughtful replies to your repeated insistence that this is wrong, that you don't like it. Obviously I am not so evolved. You say a lot of stuff in absolute terms. You haven't bothered to read the spec much. That may have been fine initially, but after the first or second "it's not that simple" replies, the onus is then on you to go read it in depth, rather than disrespect the time (again, look at all those detailed, kind replies) and work (many years and many people) that went into this thing you consider "broken by design". Your tone comes off as arrogant; there is an undercurrent of How can such a stupid decision have been made! Some examples:
Markdown is not a programming language. It is a plain text style that prioritizes human readability. Not data entry convenience. Read the intro of the spec. Frankly, that the content of a list item must be left aligned is much more readable. Each list, parent and nested, in your examples above has a single item each with only a single word of content. Try making nested lists with multiple paragraphs, embedded block quotes and nested lists.
Then it can't be that important, can it? I tried it in Babelmark2 and only one does it your way. Which brings me to my last observation: You also come off a bit self-centered, that if CommonMark doesn't work the way you expect it to work, it is "broken by design". I absolutely love the way complex list items are supported. You say "IMHO" a lot. As Inigo Montoya said in The Princess Bride: "You keep using that word. I do not think it means what you think it means." |
I apologize if I've come off as arrogant or disrespectful of anybody's time, work and patience in this thread. I'm thankful of CommonMark's mere existence and of all the time and care that has been put into this project, I wouldn't even have bothered opening this issue in the first place if I didn't believe CommonMark is important and that it'd be best to have this "fixed" via the spec.
From my perspective I'm just expressing my particular opinion on the matter, not demanding that my opinion is the "correct" one, whatever "correct" even means in this context. If I say "This makes no sense at all to me." I'm saying it only because that's actually what I'm thinking, it doesn't mean that it shouldn't make any sense to everybody else unless they are stupid or something. I prefer to write that instead of things like "I'm having troubles understanding the reasoning that went behind this particular decision in the spec" or something more formal like that. I apologize if this kind of wording offended anyone.
I don't think that's a fair statement. I believe I've been prompted to read the spec here initially, which I did immediately and commented subsequently here. At that point I could have continued reading more, but I stopped once I thought I had found the root cause of the current behavior, that is: the spec considers the list delimiter as part of the indentation. When prompted again to read a particular section of the spec, here, I did that and commented on that. That last section of the spec mentioned: "And this would break a lot of existing Markdown, which has the pattern: (...) where the code is indented eight spaces.", which if I'm understanding this correctly is basically the root cause of the current behavior. It wasn't clear to me why code blocks indented by 8 spaces would break things, and I don't believe this is mentioned explicitly in the spec, so I commented on that trying to understand what the potential issue could be. Should I have read the spec even before starting the discussion? Maybe, I don't think it'd be unfair to ask that. I decided against doing that and submitting an amendment of the spec immediately because I thought this would have been a 2-posts-long thread ending either with: "This is impossible to implement because it would necessarily break this use case" or "This is too big of a change and we won't do it". And frankly even if I had read the spec immediately I would have probably still asked for clarifications because the reasoning behind why 2-spaces indented ordered lists would break things would have probably still not been clear to me.
From my point of view I've just been trying to understand why the spec works the way it does, I expressed my feelings about particular decisions (Not that it actually matters how I feel about any of this, but as a developer with some open-source projects myself I think it's very often valuable to hear some feedback from your users) and I loosely proposed (I just spit some initial ideas, not a detailed diff of the spec) some potential ideas for improving (from my point of view at least) the spec.
I know. I only said that in regular programming languages we often indent things by pressing the tab key, and perhaps indenting ordered lists the same way should be supported in Markdown too, even if by doing that we insert 2 spaces rather than 4, especially considering that unordered lists indented that way are effectively supported. (By "regular" I'm referring to the popular ones; I'm not saying that Markdown is a weird programming language or something, because there are weird toy ones that use only whitespace characters basically. In hindsight this statement was probably confusing, but also explicitly saying that I didn't consider Markdown a programming language might have come off in a negative light too. It was just a poor choice of words on my end, sorry.)
The very first paragraph of the introduction says:
Emphasis mine.
That sounds like an absolute statement. Whatever our opinions might be about this (I also think they can be more readable, but with the greater con of being more difficult to write), it doesn't really matter as far as the spec is concerned: that can't be changed for backwards compatibility, and I'm not proposing to stop supporting it in any way.
That's an interesting observation, I think it depends, maybe people just don't write deep ordered lists that often, maybe they think that Markdown actually doesn't support 2-spaces indentations under any circumstances and just go for 4 spaces all the time, maybe they've gotten used to indenting things just one more time if they don't immediately render properly.
That's actually good news, it means that most Markdown compilers are already CommonMark-compliant in this specific regard. I'm all for standardization. Also it's perhaps a bit off-topic but I think there are many developers that prefer 2-spaces indentations, it's not just my way.
Well, it feels broken to me, but I don't think the spec authors someday thought to themselves "why don't we break the spec?". That was probably a poorly worded sentiment, sorry.
And that's perfectly fine, that will always remain supported anyway for backwards compatibility reasons.
I'm not sure I'm understanding what your point is here, I think it means "In My Humble Opinion" and I write it to signify that what follows is just one guy on the internet's opinion. I'm thankful to anybody involved with writing the CommonMark spec, I'm thankful that your comments to this thread have all felt kind to me, I'm thankful that somebody commented on a random guy's question in an open-source project, I'm sorry if/when my comments haven't felt as kind, I meant no disrespect to anyone. In the next few days/weeks I'll follow up with my proposed change to the spec, my only goal here is to make Markdown, as defined by this spec, more approachable and easier to write without breaking anybody's use cases or making the spec unreasonably more complex. |
Ok thank you! I'm sorry if I came off a little harsh; Thank you for taking it the right way. Please internalize the intro to the spec. I feel a lot of the ideas for new CommonMark features fail to incorporate or even recognize the "overriding design goal" so clearly stated there. By chance, the intro illustrates its point with a complex nested list structure, apropos to your endeavor. Welcome to the CommonMark community. Good luck! |
Btw this is the very first issue mentioned by GitLab in their Markdown guide regarding their switch to CommonMark: https://docs.gitlab.com/ee/user/markdown.html#transition-from-redcarpet-to-commonmark |
@jgm why did you close this? I unfortunately haven't had the time to submit a detailed diff of the spec about this issue yet. |
Well, feel free to submit a diff in a new issue or PR if you ever do get around to it. |
It's been more than a year since I opened this issue and I haven't submitted a PR yet so I guess it may never happen especially considering that the time required to pre-parse Markdown documents on my end to fix this issue would probably be less than the time required to submit a PR for this, the latter of which may not lead to anything anyway. I'm still very much interested in addressing this though, if any maintainers changed their minds on this it'd be great to see this issue opened again. |
Another quirk related to this, I'm seeing the following in my editor: At first look that just looks broken. The problem is that one reasoning behind indentation rules for lists is that sublists should be aligned "properly" with the parent list, and indenting with an extra tab is considered ok, the thing is tabs by definition don't have a set width, their width once rendered is set by the user, so even if the compiler assumes a monospace font is being used it can't possibly check that things are visually properly nested, still potentially according to the spec 1 tab rendered with the width of 1 or 2 spaces is ok, but indenting with 1 or 2 spaces is not ok. |
The spec specifies that tabs are treated as if there is a 4 character tab width. Set your editor to behave this way and you should be fine. |
Right, but personally I prefer 2-spaces tabs. The app I used for that screenshot doesn't even allow me to customize this. Even 4-spaces tabs would break eventually, when the number used in ordered lists gets high enough the compiler at some point should require 5 spaces, I think, in which case I guess 1 tab and 1 space would be what the spec expects to find, to maintain alignment. |
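For the tab discussion above, here is a small Python sketch of how leading whitespace could be measured when tabs are treated as advancing to the next multiple of 4 columns, as the spec is said to do. The function name indent_columns and its tab_stop parameter are made up for illustration, and this only measures indentation; it is not a parser.

```python
def indent_columns(line: str, tab_stop: int = 4) -> int:
    """Width of the leading whitespace of `line`, in columns.

    A space advances one column; a tab advances to the next multiple
    of `tab_stop` (4 by default, the width assumed in the comment above).
    """
    col = 0
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            col += tab_stop - (col % tab_stop)
        else:
            break
    return col


print(indent_columns("\t1. Bar"))    # one tab
print(indent_columns("  \t1. Bar"))  # two spaces, then a tab to the same stop
print(indent_columns("\t 1. Bar"))   # one tab plus one space
```

Measured this way, one tab counts as 4 columns no matter how the editor displays it, which covers a parent marker such as "1. " (3 columns) or "10. " (4 columns). A parent starting with "100. " needs 5 columns, which is where the "1 tab and 1 space" case mentioned above comes from.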
I'm rendering the following code via markdown-it, which should be CommonMark compliant:
And this is the result:
IMHO if unordered lists indented with 2 spaces are supported then ordered lists indented with 2 spaces should be supported too, right?