Bash improvements #1443

Merged
merged 41 commits into from
Jun 10, 2019

alice-mm
Contributor

First and foremost, sorry for the rather large pull request. Most of these edits were made while writing some kind of Bash tutorial and I initially had no ambition of contributing. I didn't even have a non-pro GitHub account.

I was using Prism to write this thing. I had tons of fun, but I also noticed that lots of things were missing. If I recall correctly, even the esac keyword (used to close “switch” statements) was nowhere to be found despite case being here.

To sum most of these things up, here's an HTML document that can be used to do some kind of “before / after” comparison:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <link href="prism.css" rel="stylesheet" />
  <style>
    .token.variable {
      color: #9ec6d9;
    }

    .token.keyword {
      color: #4e9ce6;
    }

    .token.punctuation {
      color: #eaeaea;
    }

    .token.builtin {
      color: #fd96b3;
    }

    .token.operator {
      color: #d7cf71;
    }

    .token.string {
      color: #9bea81;
    }

    .token.function,
    .token.function-name {
      color: #ff934f;
    }

    .token.number {
      color: #0acc13;
    }

    .token.constant {
      color: #ee3928;
    }

    .token.comment,
    .token.shebang {
      font-weight: normal;
    }

    .token.entity {
      cursor: initial;
    }
  </style>
</head>
<body>
  <script src="prism.js"></script>
  <pre><code class="language-bash">#! /usr/bin/env bash
# More permissive shebang ↑

# Commands
column apt tac sh shellcheck shuf zenity zsh

# Keywords
esac

# Added builtin category; moreover, some builtins were neither in keywords nor in functions
unset local

# Missing special variables
$$ $#

# Here-documents:
# * support for “&lt;&lt;-” (also added in operators)
# * no expansion for version with quoted beginning tag (so no colours for dollared stuff)
&lt;&lt;- TXT
a $b $(echo)
TXT

&lt;&lt; 'TXT'
a $b $(echo)
TXT

# “+=” in operators outside arithmetic environments (append to string or array)
t+=('a')
s+='b'

# Allowed “&lt;(” and “&gt;(” (process substitution) before keywords, functions, builtins, booleans…
&lt;(yes) &lt;(if true; then :; fi) &lt;(echo) &lt;(true)
&gt;(yes) &gt;(if true; then :; fi) &gt;(echo) &gt;(true)

# Added operators within brace expansion
${x:1:2} ${x:-a} ${x:=a} ${x:?a} ${x:+a}
${!x} ${x/a/b}
${x#*a} ${x##*a} ${x%a*} ${x%%a*}
${x^a} ${x^^a} ${x,a} ${x,,a}

# Prevented “$!!!!” and “$????” from being read as long variable names
$!!!! $????

# Allowed comments in command substitution
$(
    # This is an echo
    echo
)

# Added highlighting of “\n” and such within strings
echo '1\a2\b3\c4\e5\f6\n7\r8\t9\v'
# + support for bytes with octal values (1–3 digits in 0–7)
echo '1234\056789'
# + same here but in hex (1 or 2 digits in 0–F)
echo 'abc\xdef'
# + similar things from printf's manual
printf '123\456789'
printf '\uABCDEFG'
# + escaped double quotes
"a\"b"

# Added “\” to punctuation marks, as it is used to break long lines of code
echo \
    'a'

# Highlighted some common environment variables
IFS='a'
$PS1 "$UID" ${BASH_SOURCE} "${LC_NUMERIC}"

# Highlighted square brackets as punctuation within brace expansion (array access)
${t[i]}

# Highlight names directly following “for” or “select” as variable names
for var in a b c
do :; done
select var in a b c
do :; done

# Highlight what's on the left of assignments as variable names while still detecting environment variables
var='foo'
arr+=('bar')
IFS='a'
PATH+=':a/b/c'

# Recognize function names as such in their declaration
function foo { :; }
foo() { :; }
function foo() { :; }
# Not a function:
foo { :; }

# Redirections and file descriptors
a |& b &&gt; c
a &gt;&2
a 2&lt;&
# + several exotic file descriptor manipulations from Bash's manual.

# Did my best to prevent operators that use multiple characters from being perceived as several small operators
a &lt;&lt;&lt; b &&gt;&gt; c
a && b || c
</code></pre>
</body>
</html>

I added a tiny bit of CSS in order to make it easier to distinguish things that used the same style in the theme I loaded (“Tomorrow night”).

I had to update / adapt tests and used the opportunity to add new test cases covering new features.

I did my best but I'm no JS expert, so there's probably room for improvement, but I have to say I like the result a lot. To check more realistic examples, you can simply take a look at the snippets from the tutorial-ish document I was writing when customizing this language definition.

@Golmote
Contributor

Golmote commented Jul 6, 2018

Hi! Wow, this looks great! I'll try to do a proper review this weekend.

From a really quick look, I can already tell that some indentations are weird (note that we do use tabs for indentation in Prism), and that you should probably add a builtin_feature.test test file just like the one for the keywords.

“I also noticed that lots of things were missing.”

Most of the time, components need to be actually used extensively to reveal their flaws. I'm glad you took the time to improve the component to fit your needs and even submitted a PR for it! Thanks!

Golmote
Golmote previously requested changes Jul 7, 2018
Contributor

@Golmote Golmote left a comment

You did amazing work here! Thank you so much for this.

Would you mind taking a look at my comments?

Also, as I mentioned earlier, the indentation should use tabs everywhere.

On a side note:
Issue #1457 popped up recently. I wonder if we should allow one nested level of parentheses inside Command Substitution variables, since this seems to be a common use case.
It should lead to a regexp like:
/\$\((?:\([^)]+\)|[^()])+\)|`[^`]+`/
The regexp gets a bit uglier, but that's the price to pay... Would you mind taking a look at it in this PR?
It won't fix the issue of nested double quotes that I've seen on multiple occasions in your document, but it's a start.
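
To illustrate the effect of allowing one nested level of parentheses, here is a minimal sketch comparing the suggested regexp with a simplified pattern that stops at the first closing parenthesis. The `flat` pattern and the sample line are assumptions made for illustration and are not taken from Prism's source or from issue #1457.

```javascript
// Simplified stand-in for a pattern without nesting support.
// Assumption for illustration only; not copied from Prism.
const flat = /\$\([^)]+\)/;

// Pattern suggested in the review: allows one nested "(...)" level
// inside "$(...)", plus the backtick form.
const nested = /\$\((?:\([^)]+\)|[^()])+\)|`[^`]+`/;

// Hypothetical sample line with a command substitution inside another one.
const line = 'dir=$(cd "$(dirname "$0")" && pwd)';

const flatMatch = line.match(flat)[0];
const nestedMatch = line.match(nested)[0];

console.log(flatMatch);   // $(cd "$(dirname "$0")
console.log(nestedMatch); // $(cd "$(dirname "$0")" && pwd)
```

The flat pattern ends the match at the inner closing parenthesis, which leaves an unbalanced double quote behind for later rules to pick up. The nested pattern consumes the whole substitution, but only for one level of nesting.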

Outdated, resolved review threads on:
components/prism-bash.js (5 threads)
tests/languages/bash/process_substitution_feature.test
tests/languages/bash/redirections_and_fd_feature.test
tests/languages/bash/var_assign_feature.test
tests/languages/bash/var_in_for_and_select_feature.test
tests/languages/bash/func_def_feature.test
@alice-mm
Contributor Author

alice-mm commented Jul 7, 2018

Thanks for your time and for those comments. I'll do my best to fix all that.
Sorry for the indentation. I was pretty sure my editor was using tabs, but I guess some spaces slipped through or something.
There's a lot of work, but I guess that was to be expected with such a monolithic PR 😅 It won't be trivial to delve into these things again but I don't have much choice.
Regarding #1457 (after a quick look), I'm not sure I get what happens 😮 It looks like the first single quote within $(…) wasn't considered as the beginning of a string while the second one was, thus gobbling nearly the whole script in a pseudo-string. Weird.

@Golmote
Contributor

Golmote commented Jul 8, 2018

Re: #1457, the issue is that the $() match ends at the first closing parenthesis ) encountered, instead of the second one in this case. But don't worry if you don't want to take time for it, it can be fixed later in a separate PR.

@alice-mm
Contributor Author

alice-mm commented Jul 8, 2018

components/prism-kotlin.min.js gets modified when I run Gulp, even though it does not seem related to Bash in any way… I'll refrain from committing it for now; warn me if it is required.

@alice-mm
Contributor Author

alice-mm commented Jul 8, 2018

For #1457, note that if you allow a pair of parentheses within $(…), the opening one should not be right after the $( as it would look like arithmetic expansion ($((…))).
And yeah I'm not sure it would be wise to hammer new features in this PR. 😄
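
The ambiguity with arithmetic expansion can be sketched as follows. The `guardedPattern` variant, with a negative lookahead right after `$(`, is only one possible way to express the constraint and is an assumption here; it is not necessarily what the Bash definition ended up doing.

```javascript
// Pattern from the review: one nested "(...)" level inside "$(...)".
const nestedPattern = /\$\((?:\([^)]+\)|[^()])+\)|`[^`]+`/;

// Hypothetical variant: refuse a "(" directly after "$(" so that
// "$((...))" is left for an arithmetic expansion rule.
const guardedPattern = /\$\((?!\()(?:\([^)]+\)|[^()])+\)|`[^`]+`/;

const arithmetic = '$((1 + 2))';
const substitution = '$(cd "$(pwd)")';

// Without the guard, arithmetic expansion is taken for a command substitution.
const nestedTakesArithmetic = nestedPattern.test(arithmetic);
// With the guard, it is rejected, while a real substitution still matches.
const guardedTakesArithmetic = guardedPattern.test(arithmetic);
const guardedTakesSubstitution = guardedPattern.test(substitution);

console.log(nestedTakesArithmetic);    // true
console.log(guardedTakesArithmetic);   // false
console.log(guardedTakesSubstitution); // true
```

Another option would be to try an arithmetic pattern before the command substitution pattern, so that ordering resolves the ambiguity instead of a lookahead.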

@Golmote
Contributor

Golmote commented Jul 8, 2018

“components/prism-kotlin.min.js gets modified when I run Gulp, even though it does not seem related to Bash in any way… I'll refrain from committing it for now; warn me if it is required.”

Hm that's weird. Please don't commit it.

Are there actual differences in the git diff?
What's your version of UglifyJS?

D:\Documents\prism>npm list uglify-js
[email protected] D:\Documents\prism
`-- [email protected]
  `-- [email protected]

@alice-mm
Contributor Author

alice-mm commented Jul 8, 2018

$ npm list uglify-js
[email protected] /home/alice/prog/prism/prism
└─┬ [email protected]
  └── [email protected]

Git diff shows a bunch of differences, but as it is a minified file it's not really readable.
Hm… It seems the only difference is that the keyword to is added in the “after build” version. Did someone add that in a PR and forget to commit the minified version?

@alice-mm
Contributor Author

alice-mm commented Jul 8, 2018

I think I took most of the comments into account. I have yet to try to simplify the operator list, though, and I did nothing regarding #1457. I'll have lunch, for now. 😄
Thanks again for your guidance so far.

@alice-mm
Contributor Author

alice-mm commented Jul 8, 2018

Your regexp for (…) within $(…) seems to work wonders.
image
I don't see anything else that needs to be done, now, but maybe I forgot something. I'll let you check all that.

@Golmote
Contributor

Golmote commented Jul 8, 2018

Found the Kotlin commit here: 41e3d6a. It was done right after you opened this PR, so your version is a bit behind master. Mystery solved!

@alice-mm
Contributor Author

alice-mm commented Jul 8, 2018

Yeah, I haven't pulled from Prism's repository for a while. Does that mean that I should now, or is it OK as long as the only files I edited did not change in the meantime? Travis seemed happy so I did not give it much thought.

@Golmote
Contributor

Golmote commented Jul 8, 2018

Don't worry, as long as the merge can be made without conflict, this is fine.

@Rychu-Pawel

Hey Guys. Any ETA on this? @Golmote @alice-mm

@alice-mm
Contributor Author

“Any ETA on this?”

Not sure what exactly is happening. According to the tag changes, it seems @Golmote acknowledged my changes and intends to check them or to let someone else do so. I'm on standby, should any new remark surface.

@alice-mm
Contributor Author

alice-mm commented Feb 4, 2019

Hi everyone. I wanted to see how things were going and noticed that conflicts sprouted after #1577 was merged. I tried to compute the union of the commands listed in the two PRs, still excluding shell builtins that tend to be mistaken for external commands. Looks like the conflicts are gone, now. Phew. Not used to working with multiple Git remotes. 😅

/cc @Golmote

Still not sure why this PR is stalled right at the final-ish review stage. 😢

A few months ago, I used my version again (for a Reveal.js slideshow to teach Bash stuff to my workmates) and didn't notice anything weird. Prism is fun.

@mAAdhaTTah
Member

@RunDevelopment Would you mind taking a pass through this PR and providing a final sign off & merge if everything looks good? These look like good improvements, so might be nice to get them landed.

@RunDevelopment
Member

@mAAdhaTTah I'm quite busy at the moment, so it might take a while until I get to it.

@mAAdhaTTah
Member

@RunDevelopment No worries! Just wanted to get your attention.

@alice-mm This just fell through the cracks. I know Golmote's been busy as well so he probably just didn't have a chance to get back to it. I unfortunately have little / no regex skills, so I usually defer to my co-maintainers to review that stuff.

@alice-mm
Contributor Author

alice-mm commented Feb 4, 2019

No problem! Indeed, his profile seems to show that his activity plummeted around the time of his reviews of this PR.
Thanks for reacting so fast. 🙂

alice-mm and others added 25 commits June 10, 2019 13:15
PrismJS#1443 (review)

“All themes except Tomorrow Night and Twilight highlight
builtin tokens and string tokens with the same color.
Syntax highlighting with just one color is kind of useless,
so I suggest that we give the builtin token a class-name alias
to change the style.”
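
As a rough sketch of what such an alias does: a token definition can carry an `alias`, and the alias name is added to the token's CSS classes next to its own type, so themes that style `class-name` differently from `string` will also colour builtins differently. The grammar fragment below uses a shortened, hypothetical word list, and `cssClasses` is an illustrative helper rather than a Prism function.

```javascript
// Illustrative grammar fragment; the word list is shortened and hypothetical.
const grammarFragment = {
	builtin: {
		pattern: /\b(?:local|unset)\b/,
		alias: 'class-name'
	}
};

// Hypothetical helper mirroring the idea that a token's CSS classes are
// "token", its type, and then any aliases (a string or an array of strings).
function cssClasses(type, definition) {
	const aliases = [].concat(definition.alias || []);
	return ['token', type, ...aliases].join(' ');
}

const builtinClasses = cssClasses('builtin', grammarFragment.builtin);
console.log(builtinClasses); // token builtin class-name
```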
@alice-mm
Contributor Author

Couldn't help myself; did that at work during my lunch break. I hope I correctly understood what was to be done. 😣

Member

@RunDevelopment RunDevelopment left a comment

Perfect!

Thank you very much for contributing!

@RunDevelopment RunDevelopment merged commit 363281b into PrismJS:master Jun 10, 2019

5 participants