Document all token types #2849
Comments
https://github.com/PrismJS/prism/blob/master/components/prism-python.js#L54 What am I missing or not understanding about that statement? Looks like And this is a great idea, BTW. |
Oops. Wrong language. I meant SQL. Thanks for pointing that out! |
SQL has a https://github.com/PrismJS/prism/blob/master/components/prism-sql.js#L19 |
Yes, that too. My main intention was to enable themes to decide whether operator keywords are to be highlighted as keywords or operators. Right now, Prism languages make that decision (and that pretty inconsistently as you pointed out) by assigning them one type (either |
How would you encode that to HTML then? Nested tags? I dislike that because it makes Is that a bug? A feature? Basically it seems nested tags makes everything a lot more error prone. At least that was my thinking. We're going thru this same thing right now and I think I've decided against nesting and instead going to flatten scopes to either Ref: highlightjs/highlight.js#2521 highlightjs/highlight.js#2500 |
In my ideal world the top scope is general and the lower scopes are specific. So I'd expect |
<span class="token operator keyword">NOT</span> With
No. I don't want <span class="token operator"><span class="token keyword">NOT</span></span> It's a lot harder to make specific styles like this and both the generated HTML and the Prism languages definitions generating the HTML will become unnecessarily bloated.
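A minimal sketch of the flat (non-nested) rendering described above. The helper names `escapeHtml` and `renderToken` are hypothetical and are not Prism's API; the sketch only assumes that a token has one type plus a list of aliases, and that each becomes a class on a single span.

```javascript
// Hypothetical helpers for illustration only (not part of Prism).
function escapeHtml(text) {
  // "&" must be replaced first so the entities added below are not escaped again.
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function renderToken(type, aliases, content) {
  // "token" first, then the type, then each alias; duplicate names are dropped.
  const classes = [...new Set(["token", type, ...aliases])];
  return `<span class="${classes.join(" ")}">${escapeHtml(content)}</span>`;
}

console.log(renderToken("operator", ["keyword"], "NOT"));
// prints: <span class="token operator keyword">NOT</span>
```

Because every name lands on the same element, a theme can target `.token.operator`, `.token.keyword`, or both, without the extra wrapper element that nesting would require.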
I read through it and I think I'm aiming for the same thing. I like the "labels" approach (option 2). I.e.
However, labels force themes to "resolve conflicts". If more than one rule applies for a token, the theme has to decide how these rules interact. This is good because it gives themes all the freedom but also all the responsibility. This means that theme authors have to be careful because every rule might interact with every other rule. This won't be a problem if the theme only set I think that this is the better approach: It's easy to implement, extendible, and clearly separates the concerns of tokenization (= understanding what code means) and styling (translating code meaning into colors). |
Yeah I think I understood, I was just asking for clarification to make 100% sure.
Sadly I always have to keep in mind 100 existing themes that I'd really like to hand edit as little as possible. :-)
We're still talking different things then. What I was proposing was:
Let's switch from SQL to a more common pattern that I think illustrates the difference better.
We've traditionally rendered this as:
Which has been targeted with CSS like
And now I'm seeing my own point disappear before my eyes... I guess Render is technically both a
This is more how I think about scopes (TextMate style)...
The sub scope refines the main scope - it is not a full scope in its own right. While Your approach would seem to make conflicts of sub scopes and top-level scopes more likely, no? Thoughts? |
Let me start by saying that there is a problem with "sub-scopes" in textmate scopes: you have to impose a hierarchy. Sub scopes necessitate a hierarchy because scopes are an ordered list of names. This is a problem as there may not be a hierarchy between sub scopes. E.g. Let's say we have a Rust doc comment (e.g. The problem is that scopes are a list but the logical hierarchy is at least a tree. The tree for the Rust example above is:
But a tree of names is 1) a lot more complex and 2) translates even worse to CSS class names than scopes. Back to the label approach:
I see what you mean. A style rule for just However, is that really a problem? It doesn't make sense, so why would anyone do it? I mean you can have the same nonsense in textmate too, right? Isn't just
Yes, there will be a bunch of conflicts but we will use CSS to take care of them. Let's look at a .token.keyword { color: blue; }
.token.operator { color: red; } Q: How does CSS resolve this conflict? A: CSS uses rule order. If two rules in the same stylesheet match the same element and have the same specificity, the rule defined last (top to bottom order) will be used. In this example, the element will get the color red. Adding more specific rules is easy as well. .token.keyword { color: blue; }
.token.operator { color: red; }
.token.operator.keyword { color: yellow; } /* works as expected */ To make theme authors aware that rule order is important, I suggest sorting rules by ascending specificity (as seen in the example above). Within a specificity level, rule order matters so authors have to decide which rules override which by bringing them into the right order. Thoughts? |
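The conflict resolution described above can be sketched as a small model. This is a simplification under stated assumptions: selectors consist only of classes, specificity is taken to be the number of classes, and `matches` and `resolveColor` are hypothetical names introduced for illustration. It is not a CSS engine.

```javascript
// A rule matches when the element carries every class in the rule's selector.
function matches(rule, elementClasses) {
  return rule.classes.every((c) => elementClasses.includes(c));
}

// Rules are given in stylesheet order (top to bottom).
function resolveColor(rules, elementClasses) {
  let winner = null;
  for (const rule of rules) {
    if (!matches(rule, elementClasses)) continue;
    // ">=" lets a later rule of equal specificity override an earlier one,
    // while a less specific later rule never beats a more specific earlier one.
    if (winner === null || rule.classes.length >= winner.classes.length) {
      winner = rule;
    }
  }
  return winner ? winner.color : null;
}

const element = ["token", "operator", "keyword"];
const theme = [
  { classes: ["token", "keyword"], color: "blue" },
  { classes: ["token", "operator"], color: "red" },
];
console.log(resolveColor(theme, element)); // prints: red

const extended = [
  ...theme,
  { classes: ["token", "operator", "keyword"], color: "yellow" },
];
console.log(resolveColor(extended, element)); // prints: yellow
```

Sorting rules by ascending specificity, as suggested above, means the only decisions left to the theme author are the orderings within one specificity level.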
You're persuading me a bit, but I'll answer your points and see if you find anything compelling. :)
For Highlight.js I'm not much interested in more than 2 scopes... so this isn't a huge problem for us I don't think (juggling 3-4 scopes and ordering). We've gotten by with a single scope (plus nesting that's typically 1 level deep) for 10+ years so it seems like 2 scopes should be more than enough to get the job done. So that simplifies matters considerably. And for backwards compatibility we're probably adding to our existing scopes... so in many cases we already know which scope comes first... it's only a matter of adding some specificity... is it just a But to actually answer your question there is 10 years of prior art with TextMate grammars in the real world... someone else has likely already solved this ordering problem, no? I suppose you might hate their answer, but I'm sure you could find it and adopt it - I doubt there is a need to reinvent the wheel here. So I'm not sure I would agree that that is a large or unsolvable problem - since TM clearly solved it 10 years ago. :-)
I dunno, lol... I mean anything CAN be a scope in TextMate... it's just up to styles to colorize them... and I'd say looking at the most popular styles would give you some idea of what the "canonical" scopes are. It's not listed on the page that I linked you to. (other than Your CSS examples do look pretty though. :) Let me skim thru the TM document again. |
Would working together to come up with a set of common scopes across the two different engines make any sense at all or does that defeat the purpose of having different engines? :-) |
Don't worry, I don't hate TM scopes. Especially their scope selectors elegantly solve a lot of problems. They aren't perfect (see example above) but they are very good. The main problem I have with TM is that scope selectors just don't translate well into CSS selectors.
VSCode
That being said, I came across this blog post from the VSCode dev team. It describes how they transitioned to their current TM-based highlighter and what they had before. The interesting part is that they used to use the label approach. E.g. the following TM scopes
were turned into the following HTML <span class="token meta function js definition punctuation block">{</span> They summarized this approach for TM themes as follows:
Basically, the label approach is mostly incompatible with TM themes. This is something to keep in mind and is relevant for #2848.
Approaches
What follows is a list of discussed approaches and how they relate to TM scopes (assuming no parent scopes). I also added a hybrid approach into the mix. All approaches assume that we don't know anything about the CSS theme at runtime. (I will use
TM scope approach
A single TM scope can be expressed as CSS classes like this:
<span class="a a_b a_b_c a_b_c_d">{</span>
/* Rule order matters because they all have the same specificity */
.a { color: red; }
.a_b { color: blue; }
.a_b_c { color: green; }
.a_b_c_d { color: yellow; }
Advantages:
Disadvantages:
This is what you outlined in highlightjs/highlight.js#2521, right @joshgoebel?
Label approach
(I only list this here for completeness) The label approach sees scopes as a set of strings (TS definition:
<span class="a b c d">{</span>
/* Rule order doesn't matter in this example */
.a { color: red; }
.a.b { color: blue; }
.a.b.c { color: green; }
.a.b.c.d { color: yellow; }
Advantages:
Disadvantages:
Hybrid approach
Thinking about how Prism does things again, maybe we could use a hybrid approach? Instead of saying that a scope is a list of names, how about we say: A scope is a two-element tuple where the first element is the main scope and the second element is a set of sub scopes. (TS definition: CSS classes will be constructed like this:
<span class="a _b _c _d">{</span>
/* Rule order doesn't matter in this example */
.a { color: red; }
.a._b { color: blue; }
.a._b._c { color: green; }
.a._b._c._d { color: yellow; }
( Advantages:
Disadvantages:
Conflicts
Conflicts occur if the selectors of two CSS rules match the same element. All approaches will cause rule conflicts to a greater or lesser degree. It is important to note that all conflicts can be resolved using CSS specificity and rule order. However, this will likely require a lot of care by the theme author.
Is there anything I missed? Are there any other approaches that might be interesting? I'd be interested to hear your thoughts @joshgoebel. |
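As a rough sketch, the three class-name constructions discussed above (TM scope, label, hybrid) could be written as follows. The function names are hypothetical, a TM scope is assumed to be given as an array of names, and the "_" prefix for hybrid sub scopes is read off the HTML example rather than from any specification.

```javascript
// TM scope approach: one class per prefix of the scope, joined with "_".
function tmScopeClasses(names) {
  return names.map((_, i) => names.slice(0, i + 1).join("_"));
}

// Label approach: the scope is an unordered set of names, one class each.
function labelClasses(names) {
  return [...new Set(names)];
}

// Hybrid approach: a main scope plus a set of sub scopes,
// where each sub scope is marked with a "_" prefix.
function hybridClasses(main, subScopes) {
  return [main, ...[...new Set(subScopes)].map((s) => "_" + s)];
}

console.log(tmScopeClasses(["a", "b", "c", "d"]).join(" ")); // prints: a a_b a_b_c a_b_c_d
console.log(labelClasses(["a", "b", "c", "d"]).join(" "));   // prints: a b c d
console.log(hybridClasses("a", ["b", "c", "d"]).join(" "));  // prints: a _b _c _d
```

The TM scope variant repeats earlier names in every class, which is where the size concern comes from; the other two emit each name once but lose some or all of the ordering.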
I thought of yet another approach. It's equivalent to the TM scope approach but eliminates some of its problems.
TM scope prefix approach
<span class="a _b __c ___d">{</span>
/* Rule order doesn't matter in this example */
.a { color: red; }
.a._b { color: blue; }
.a._b.__c { color: green; }
.a._b.__c.___d { color: yellow; } The trick is to encode the position of each name using a prefix (e.g. no prefix = index 0, one prefix char = index 1, two prefix chars = index 2, and so on). This basically takes care of the differences between TM scope selectors and CSS selectors as far as I can see. Repeating the same character might seem a bit wasteful but it should be fine in practice. Scope names aren't single letter names after all. Advantages:
Disadvantages:
Apart from the limitations of TM scopes themselves, I think this accurately translates TM scopes into CSS classes. Keep in mind that this doesn't cover parent scopes. They are still a problem, I think. |
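The position-encoding trick can be sketched like this. The names `prefixClasses` and `prefixSelector` are hypothetical, "_" is assumed as the prefix character (as in the example above), and parent scopes are not handled.

```javascript
// Encode the index of each name with that many "_" prefix characters.
function prefixClasses(names) {
  return names.map((name, index) => "_".repeat(index) + name);
}

// Translate a TM scope selector (given as its leading names) into a CSS selector.
function prefixSelector(names) {
  return prefixClasses(names).map((c) => "." + c).join("");
}

console.log(prefixClasses(["a", "b", "c", "d"]).join(" ")); // prints: a _b __c ___d
console.log(prefixSelector(["a", "b", "c"]));               // prints: .a._b.__c
```

Since the prefix pins each name to its position, `.a._b` can only match a scope whose first two names are `a` and `b`, which is the behaviour a TM scope selector `a.b` would have.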
Oh wow you've put so much thought into this. Let me come along at a bit of a higher level and see if I can add anything useful.
TM scope approach
I just realized I have experience with this. This is exactly what Pastie did ~15 years ago using TextMate grammars on the server-side. There was a nice TM plugin that converted themes to CSS and it did so using this strategy. A simple case:
This works well for "multi-scope" matchers as well: Something can be scoped as BOTH a
And TextMate even had a plugin that pasted to us raw HTML... and in those cases it would just upload an HTML payload (gzipped)... so Pastie could even render content for grammars that we knew nothing about - because it just rendered a cached copy of the parsing that TM had already done on the client-side... (yet we always used our CSS) The theme fidelity of this approach was always 100% in my recollection.
Is this truly a problem though? This content is computer generated. Now 15 years ago because Pastie streamed HTML "over the wire" I did add a single filter for
But the TM generated content still included those things and I don't recall it ever being THAT much of a problem...
TM scope prefix approach
This would seem to be CLOSE to the former, but slightly worse to me. And (see above) I'm not sure the "space saving" is a benefit that matters. It adds a degree of fuzz or ambiguity with regard to sub scopes.
Label approach
I've implemented this for Highlight.js (changed a few lines) and it's not terrible (though I have very little experience with it so far and am only using it for a few things). I had to reorder some of our CSS (for legacy issues) but I was trying to rework our docs when I realized how this can get muddled. We have an example:
I was trying to explain that while But then I thought why not
OR
Hybrid approach
I don't see a huge technical distinction here between labels... I realized that while I was implementing labels I was almost thinking of them in my head more like this anyways, but without the naming distinction ( So essentially you have parent scopes, and then any sub-scopes are tags... they don't exist in a hierarchy. IE: I honestly think (on its own merits) there is a lot to like about this approach - from a simplicity perspective... It does have some of the "endless creativity" issues I mentioned above, but now restricted to sub-scopes, which I think would be far more manageable. But it's taking quite a step away from TextMate scoping...
If it sounds like I strongly prefer TM scopes that's inaccurate. I think at this point it seems TM vs hybrid is very much about one's goals. Hybrid seems clearly better (organizationally) than Labels because it enforces SOME hierarchy and that's good for theme consistency and development and understanding the domain. You may also want to spend some time thinking about theme compatibility and see if that changes your thinking on any of this. For example... you say you want Super duper. Now characters are broken in every old theme that is only aware of strings. I've been considering automated processing for our CSS files (unless they are tagged as "leave me alone") to "fill in" gaps like this... say if we added Thoughts? :-) |
I'm not sure you really desire scopes 4 levels deep for Prism, do you? Is some of this discussion in the abstract - or are you really wanting to move towards a TM level of nuance in your highlighting? I think I always intended to limit it to So I think that in practice hybrid would never really be:
But most likely ever only:
So unless you're truly trying to copy TM theme fidelity you'd never have such deep scopes in practice. To me that also makes the "but we're repeating the names multiple times" disadvantage less of a big deal. |
I think for the v11 development cycle I'm going to try "Scope Prefix" but moving the A title.class (name of a class):
|
Do you all never run into collision issues with conflicting class names outside your hierarchy in Prism.js? |
A related item (at least for us.. I'm not sure how granular Prism can be or wants to be)... given the following:
The string has 3 scopes:
But is that:
Our engine allows for both pretty easily so we need to decide which is recommended/canonical. I'd say the latter is a bit more CSS like... and we've traditionally always highlighted the whole item as a "string"... so updating a grammar to highlight only the begin/end pairing is 1 or 2 lines of code vs switching to an entirely new syntax if one wants to do this same thing with multi-match by giving all 3 items individual scopes. Although it's also possible we soon add a But if there are any reasons to prefer the "flatter" approach... |
Hi! I'd like to help document token types, especially since it'll help a lot with theme creation. I'll admit I'm not as familiar with this repo as I am with prism-themes, but I think I can find my way around enough to write these docs. Just a few questions at this point:
|
Thank you very much @hoonweiting!
If you have any other questions or need help, feel free to ask any time! |
I've started work on it, but I would like some input/help at this point! Is the file name (On a slightly related note, is Prism open to larger website design inputs/PRs? (Not so much of an overhaul, more of a makeover.) It's something I can and would like to help with, though it'll have a longer runway compared to this doc, for instance.)
Which tokens should be considered standard tokens? Initially I thought I'd write about all the tokens that I included in prism-theme-template.css, but it occurred to me that they might not all be standard tokens! For example, comment, boolean, number, char, string, url, regex, punctuation, constant, variable, property, operator, keyword, builtin, class-name, function, inserted, deleted, bold, italic, important, prolog, doctype, cdata, namespace, tag, selector, attr-name, attr-value, atrule, entity. Also, do you think there are more tokens that should be considered standard tokens that aren't listed at all? Random sampling on the FAQ page probably isn't the best way, haha.
Finally, this is very unrelated, but is prism/components/prism-markdown.js Lines 29 to 41 in 8daebb4
I don't think it matters too much but yeah, just something I saw while poking around. |
Yes, sounds good.
The section of embedded languages can stay. It's important.
I think all token names you listed should be standard tokens. Details:
While all of these tokens are specific to certain language types, I do think that theme should support all of them and language definition authors should be aware of them.
No, the list is long enough as is for now. We can also add more later.
First of all, yes. But also, a little no. Could you please open an issue for this and tag all Prism maintainers?
Yep, I made a typo. Thanks for noticing! I'll fix it. |
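One way such a list could be used is sketched below: checking whether a token would receive any styling from a theme that only knows the standard names. The set is the list proposed in the comment above (the final documented list may differ), and `hasStandardStyle` as well as the token name `my-custom-token` are hypothetical.

```javascript
// Token names proposed as standard in this thread (an assumption about the final list).
const STANDARD_TOKENS = new Set([
  "comment", "boolean", "number", "char", "string", "url", "regex",
  "punctuation", "constant", "variable", "property", "operator", "keyword",
  "builtin", "class-name", "function", "inserted", "deleted", "bold",
  "italic", "important", "prolog", "doctype", "cdata", "namespace", "tag",
  "selector", "attr-name", "attr-value", "atrule", "entity",
]);

// A token is covered by a standard-only theme if its type
// or at least one of its aliases is a standard token name.
function hasStandardStyle(type, aliases = []) {
  return [type, ...aliases].some((name) => STANDARD_TOKENS.has(name));
}

console.log(hasStandardStyle("keyword"));                     // prints: true
console.log(hasStandardStyle("my-custom-token", ["number"])); // prints: true
console.log(hasStandardStyle("my-custom-token"));             // prints: false
```

A check like this, run over a language definition, would flag the non-standard tokens that have no standard alias and therefore show up unstyled in most themes.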
Ahh okay thanks! I was more worried about the client needing to download more information for each language I use, so I was trying to min-max, in a way. Then again, perhaps I have lost sight of what a 'large' webpage is, and that a couple of kB is really nothing! Plus, I am fortunate to live in a country with relatively fast Internet speeds, so it's even harder to tell. But I am assured now!
Got it! I'm currently using Elm, but I'm sure it doesn't hurt to add more examples later on. Also, I see that Elm supports
Wow! TIL. Thank you! |
Of course! Thank you! |
Hey @RunDevelopment! I was wondering how I could help out with the second half of this issue. For starters I could probably swap out
Huh, I guess that's all the questions I have for now, maybe I'll think of more eventually. And oh yeah, this is probably going to take a while, so how should we go about tracking the progress? This would be especially helpful if more people want to pitch in too! |
Thank you for the offer!
Good idea. Could you make a tracking issue? Just a simple task list like this should be enough.
- [ ] abap
- [ ] abnf
- [ ] actionscript
- [ ] ada
- [ ] agda
- [ ] al
- [ ] antlr4
- [ ] apacheconf
- [ ] apex
- [ ] apl
- [ ] applescript
- [ ] aql
- [ ] arduino
- [ ] arff
- [ ] asciidoc
- [ ] asm6502
- [ ] asmatmel
- [ ] aspnet
- [ ] autohotkey
- [ ] autoit
- [ ] avisynth
- [ ] avro-idl
- [ ] bash
- [ ] basic
- [ ] batch
- [ ] bbcode
- [ ] bicep
- [ ] birb
- [ ] bison
- [ ] bnf
- [ ] brainfuck
- [ ] brightscript
- [ ] bro
- [ ] bsl
- [ ] c
- [ ] cfscript
- [ ] chaiscript
- [ ] cil
- [ ] clike
- [ ] clojure
- [ ] cmake
- [ ] cobol
- [ ] coffeescript
- [ ] concurnas
- [ ] coq
- [ ] cpp
- [ ] crystal
- [ ] csharp
- [ ] cshtml
- [ ] csp
- [ ] css
- [ ] css-extras
- [ ] csv
- [ ] cypher
- [ ] d
- [ ] dart
- [ ] dataweave
- [ ] dax
- [ ] dhall
- [ ] diff
- [ ] django
- [ ] dns-zone-file
- [ ] docker
- [ ] dot
- [ ] ebnf
- [ ] editorconfig
- [ ] eiffel
- [ ] ejs
- [ ] elixir
- [ ] elm
- [ ] erb
- [ ] erlang
- [ ] etlua
- [ ] excel-formula
- [ ] factor
- [ ] false
- [ ] firestore-security-rules
- [ ] flow
- [ ] fortran
- [ ] fsharp
- [ ] ftl
- [ ] gap
- [ ] gcode
- [ ] gdscript
- [ ] gedcom
- [ ] gherkin
- [ ] git
- [ ] glsl
- [ ] gml
- [ ] gn
- [ ] go
- [ ] graphql
- [ ] groovy
- [ ] haml
- [ ] handlebars
- [ ] haskell
- [ ] haxe
- [ ] hcl
- [ ] hlsl
- [ ] hoon
- [ ] hpkp
- [ ] hsts
- [ ] http
- [ ] ichigojam
- [ ] icon
- [ ] icu-message-format
- [ ] idris
- [ ] iecst
- [ ] ignore
- [ ] inform7
- [ ] ini
- [ ] io
- [ ] j
- [ ] java
- [ ] javadoc
- [ ] javadoclike
- [ ] javascript
- [ ] javastacktrace
- [ ] jexl
- [ ] jolie
- [ ] jq
- [ ] js-extras
- [ ] js-templates
- [ ] jsdoc
- [ ] json
- [ ] json5
- [ ] jsonp
- [ ] jsstacktrace
- [ ] jsx
- [ ] julia
- [ ] keepalived
- [ ] keyman
- [ ] kotlin
- [ ] kumir
- [ ] kusto
- [ ] latex
- [ ] latte
- [ ] less
- [ ] lilypond
- [ ] liquid
- [ ] lisp
- [ ] livescript
- [ ] llvm
- [ ] log
- [ ] lolcode
- [ ] lua
- [ ] magma
- [ ] makefile
- [ ] markdown
- [ ] markup
- [ ] markup-templating
- [ ] matlab
- [ ] maxscript
- [ ] mel
- [ ] mermaid
- [ ] mizar
- [ ] mongodb
- [ ] monkey
- [ ] moonscript
- [ ] n1ql
- [ ] n4js
- [ ] nand2tetris-hdl
- [ ] naniscript
- [ ] nasm
- [ ] neon
- [ ] nevod
- [ ] nginx
- [ ] nim
- [ ] nix
- [ ] nsis
- [ ] objectivec
- [ ] ocaml
- [ ] opencl
- [ ] openqasm
- [ ] oz
- [ ] parigp
- [ ] parser
- [ ] pascal
- [ ] pascaligo
- [ ] pcaxis
- [ ] peoplecode
- [ ] perl
- [ ] php
- [ ] php-extras
- [ ] phpdoc
- [ ] plsql
- [ ] powerquery
- [ ] powershell
- [ ] processing
- [ ] prolog
- [ ] promql
- [ ] properties
- [ ] protobuf
- [ ] psl
- [ ] pug
- [ ] puppet
- [ ] pure
- [ ] purebasic
- [ ] purescript
- [ ] python
- [ ] q
- [ ] qml
- [ ] qore
- [ ] qsharp
- [ ] r
- [ ] racket
- [ ] reason
- [ ] regex
- [ ] rego
- [ ] renpy
- [ ] rest
- [ ] rip
- [ ] roboconf
- [ ] robotframework
- [ ] ruby
- [ ] rust
- [ ] sas
- [ ] sass
- [ ] scala
- [ ] scheme
- [ ] scss
- [ ] shell-session
- [ ] smali
- [ ] smalltalk
- [ ] smarty
- [ ] sml
- [ ] solidity
- [ ] solution-file
- [ ] soy
- [ ] sparql
- [ ] splunk-spl
- [ ] sqf
- [ ] sql
- [ ] squirrel
- [ ] stan
- [ ] stylus
- [ ] swift
- [ ] systemd
- [ ] t4-cs
- [ ] t4-templating
- [ ] t4-vb
- [ ] tap
- [ ] tcl
- [ ] textile
- [ ] toml
- [ ] tremor
- [ ] tsx
- [ ] tt2
- [ ] turtle
- [ ] twig
- [ ] typescript
- [ ] typoscript
- [ ] unrealscript
- [ ] uri
- [ ] v
- [ ] vala
- [ ] vbnet
- [ ] velocity
- [ ] verilog
- [ ] vhdl
- [ ] vim
- [ ] visual-basic
- [ ] warpscript
- [ ] wasm
- [ ] web-idl
- [ ] wiki
- [ ] wolfram
- [ ] wren
- [ ] xeora
- [ ] xml-doc
- [ ] xojo
- [ ] xquery
- [ ] yaml
- [ ] yang
- [ ] zig
I hope you like scrolling :) Since it will probably be mostly the two of us working on the issue, we should both be able to edit the issue (i.e. mark items in the task list as checked). You wouldn't be able to do that if I created the issue/comment. |
Sounds good! I'll create a tracking issue later, though I don't expect to work on it until the weekend comes around. I haven't looked at the source files/file history/PRs/commits yet, but I'm wondering if we should get all that information in one place, like maybe inline comments with a note as to why it has no standard tokens or whatever? Oh and I guess a related question is, how can we better communicate the lack of highlighting (with CSS) for these non-standard tokens? I see that there were some issues raised in the past wondering why a certain language wasn't getting highlighted, or why it had so little highlighting, stuff like that. I think the only indication on it on the website is in the FAQ...and hmm, I guess this should be part of the website discussion (soonTM) and not here. Blah, I'll keep that in mind for that discussion in the future. Last question for now, perhaps would this be a good opportunity to start some work on #2850 as well? |
I wonder whether we even need to? Non-standard tokens without standard aliases are somewhat rare. A comment explaining their function (if non-obvious) would be nice but I don't think that we need to require comments. So add comments if you think they are needed, I guess.
Except for |
Ah, sorry for being MIA the past few weeks, but I'm back! (Well, maybe when I wake up in the afternoon.) I'll follow your lead with what you've done so far, thanks for doing so much!!! |
Okay hi I'm actually alive again!! I've decided to look at LOLCODE first (lol), and hmm, I'm having a little trouble with it. A variable in LOLCODE may be defined as such: I HAS A <variable name> ITZ <value>
prism/components/prism-lolcode.js Lines 46 to 49 in adcc878
It's really been a while since I looked at regex (or Prism...), but is this supposed to capture whatever comes after I'm asking this because I can't tell whether it's a typo ( |
Welcome back :) Not an expert, but AFAIK, implicit variables are all variables that you did not define yourself. So I would also read the spec as "there is a variable called So the |
Ah okay, thank you for the explanation! LOLCODE is good to go then! |
I've got another question, I'm looking at editorconfig right now, and it looks pretty good (other than the broken link to the docs), I'm just wondering if it would be more accurate to use prism/components/prism-editorconfig.js Lines 18 to 23 in 8476a9a
|
I was wondering the same. The problem is that we don't have token names for key-value pairs. INI, systemd, and some other simple key-value config formats have the same issue. Even config formats like JSON, YAML, and TOML are affected to a lesser extent. Frankly, I don't have a good general solution for this. |
Ah, I had a different interpretation of the 'categories' on tokens.html actually; literally they're just categories, to break the table into more digestible portions, and to a smaller extent, sort of to show where those tokens are likely to be found. (Personally it took me a while to find some of the tokens when creating my first theme!) I do not find the categories to restrict anything. An example of tokens not being restricted to its category (besides prism/components/prism-brainfuck.js Lines 6 to 13 in 220bc40
In the case of JSON, I'm not familiar with INI and systemd, but I just took a look and it seems like they use As for a general solution...I'm really not sure! I tried looking at YAML and TOML a few hours ago, but hooo I'm not prepared to look at those regexes yet! :") |
I was referring to the description of That being said, giving them Regarding brainfuck: Standard tokens can be used as aliases to give styles to non-standard tokens. It's unusual to use
Yeah, YAML especially... YAML is a really complex language, so maybe ignore those regexes for now. If you absolutely needed to understand YAML's regexes, I would recommend reading through YAML's spec first. |
I see! I wanted to suggest revising the description, but I can't think of a better one at the moment; and not to mention, maybe I shouldn't be 'bending the rules' to begin with, haha.
Yeah! The whole aliasing semantics thing (#354 (comment)). Really makes me wonder if the descriptions in tokens.html can be improved... It doesn't feel like there's a lot of room for semantic aliasing at the moment, but I don't know, I'm not a literature/language student, and have no business writing dictionaries! 🤪🤪🤪 Also, I'm looking at git right now, I wanted to add some aliases to some of the non-standard tokens (taking inspiration from the comment linked above), especially since there's been at least three separate issues filed for the seeming lack of highlighting for git blocks. However, I don't really know whether I should proceed right now, given the above discussion, and Prism's intention of leaving the theming to the user (#1615 (comment))? I will, however, submit a PR for editorconfig! *pretends to not see YAML* |
Good idea. It might be enough to say that
We should probably hold that off for another day. Right now, git is a strange mix between Diff and Shell sessions. I don't know whether we should keep it like that. Maybe we could restructure git to add diff support for Shell sessions instead of being its own language? Idk. |
You've got it! In addition to that, it'd be nice to look over the other descriptions too. Something that's been bugging me is how restrictive the descriptions feel, like only one definition is locked in. Some words have multiple definitions, and maybe we can try having alternative descriptions too? I don't have any solid suggestions right now, so maybe this will not bear any fruit after all!
Got it! I might not be very helpful here, because I very much prefer using graphical interfaces instead of the command line where possible. But now that you mention it, what about git and Bash? |
That's actually a good thing, IMO. Tokens have semantic meaning, so it's better for them to be narrow than to encompass many different concepts. We already have a token that is semantically ambiguous:
Sure, we still get a pretty color, but it's most likely not what the artist that created the theme intended.
Yep, and it's a problem. E.g.: There are multiple languages that have concepts called "tags" that are not markup tags. |
Ah hmm... To be really honest I've not used a namespace in HTML before, but looking at the Wikipedia article, I suppose we can move the As for how it looks, maybe this is a ridiculous suggestion, but perhaps we could set the opacity of
I see, perhaps we can move it out of the markup languages category too. And for tags specifically, that's where I had the problem of restrictive definitions, because, to me at least, when I think of 'tags', I think of labels, like price tags. So aliasing There is also I can submit a PR to make these changes to tokens.html if you're in agreement! |
You assume that there is a good way to do that :) Unfortunately, there isn't. There are 2 problems that prevent this from being a simple CSS trick:
The best way around this problem might be to narrow down the semantic meaning of standard tokens using aliases. So instead of using just This is pretty much what we are doing already with semantic aliasing (e.g. This would allow standard tokens to be more flexible/vague by making the descriptions/meanings of combinations very specific/narrow. The best thing about these combinations is that they include a standard token, so they are an opt-in mechanism for themes to support more granular highlighting. This means that we can add as many combinations as we like without breaking any themes. Thoughts?
Really? I would have said that a |
Damn, you're right! I would not have noticed the fallacies in my suggestion haha
Your "extended standard tokens" (gotta sleep on that name) makes a lot of sense, and feels like something (similar to?) you and Josh discussed in April, or at least, what I can grasp of it! It does seem to tie a bit into #2850 as well? I mean, maybe I'm misunderstanding somewhere, but it looks like a way to ease into #2850, and killing two birds with one stone without breaking other things sounds great! Also, capturing and listing (some of) these non-standard tokens would be useful too, so there would be some consistency/re-use of non-standard tokens across languages in a way. Besides that, I think I'm really hitting the limits of the level I can think at right now. I'm happy to help with the code/docs to whatever extent I can, but I'm sorry I can't quite contribute to the conversation! :")
Right, it's a hash. As a layperson, both work for me! |
You're right, it does. These "extended standard tokens"/token combinations (let's hope this name doesn't stick) are pretty much what I wanted and (if implemented) would resolve #2850.
Good point.
Thank you so much!
Oh, I think you contributed quite a lot to it already. Thank you! |
I don't have a better name for these token pairs yet, but I was wondering if we could avoid naming it altogether. Currently I'm picturing tokens.html to contain these sections (very, very rough descriptions):
This way, it might be possible to avoid naming the "extended standard tokens"! 🤪 Although, giving these "extended standard tokens" an actual name would help a lot in discussions! |
What Highlight.js ended up doing:
By having the 1st party/3rd party split we can say "hey lets keep it official in core" and with 3rd party grammars say "do whatever you think is best for your grammar, no rules at all"... I don't think Prism has that luxury though. |
Naming things is powerful. They aren't just non-standard tokens. A non-standard token is just that, a single token name that isn't a standard token name. However, these combinations form something new, so they need a name. As @joshgoebel pointed out, these combinations are somewhat similar in function to Highlight.js' sub scopes. So "sub scopes" might be a good inspiration for the name of these combinations, though I'd rather avoid calling them "substandard tokens". As for the structure of
|
Happy new year! 🎉 I get your point. Unfortunately, I still don't have a better name for it! I thought of 'advanced tokens', like how 'advanced search' is a more specific version of 'search', but I feel that the word 'advanced' might seem a little daunting. 'Combinators' / 'Combinations' sounds promising, though! As for the structure of |
I thought about a name again. How about "composite tokens"?
I think the word "composite" fits pretty well for our combination of non-standard and standard tokens.
Yeah, composite sounds really suitable! 😀 |
Motivation
Themes depend on Prism producing tokens with specific types (or aliases). Right now, we do not guarantee or document any of those types.
Description
Document all standard token types (e.g. keyword, comment, ...). It should explain the general concept behind each token type and give at least one example.
The documentation should also include how languages are embedded. Example.
We should also guarantee that languages use these token types for these concepts. (E.g. we guarantee that keywords always have a keyword class.) It might sound like we already do this but this is not the case right now. (E.g. we have many languages with operator keywords (e.g. NOT in SQL) that do not have a keyword class.)