support for text folding #1875

Closed
albertz opened this issue Jan 3, 2019 · 62 comments
Labels
type/proposal A proposal that needs some discussion before proceeding

Comments

@albertz

albertz commented Jan 3, 2019

It would be nice to support text folding, i.e. escape codes surrounding a block of text so that you can fold that block away. Similar to e.g. how Travis does it (example).

Some related links:

@Tyriar
Member

Tyriar commented Jan 3, 2019

I don't think we'd want to support this unless the sequence is widely used by programs and/or somewhat standardized.

@Tyriar Tyriar closed this as completed Jan 3, 2019
@albertz
Author

albertz commented Jan 3, 2019

Well, unless there is a terminal emulator which introduces such a feature, why would a program make use of it? So this is why a project like xterm.js would have to introduce such a feature. This is how it works for all kinds of extended escape codes (see e.g. my links here).

I thought that xterm.js wants to be a modern terminal emulator. By your logic, first some other (widely used) emulator would have to introduce such a feature, then it has to become widely used by programs, and only then xterm.js would adopt it. Which means that xterm.js would become kind of unmodern by then.

I'm quite sure that there are apps depending on xterm.js which would like to have such a feature, e.g. Hyper.

@egmontkob

There are a gazillion terminal extensions out there. E.g. the linked DomTerm page lists maybe about 50 custom escape sequences. I think it's fair to leave it to each terminal emulator's developers to decide which ones they adopt and which ones they don't.

As a VTE developer lurking around here, I love the idea, but I really wish there had been some cooperation between some popular terminal emulators for the design.

It's unclear to me how the \e[16u and \e[17u escape sequences defining the arrows are coupled together with the \e[83;...u (*) ones, and what happens in cases when they don't arrive in the "expected" order (e.g. the 17 just doesn't come), or the cursor is moved in between (e.g. moved backwards between the opening and closing 83).

(*) The attached Java program doesn't print 83, only 16 and 17, so I probably don't understand something correctly. It's unclear to me what sequence specifies exactly the toggleable block.

I don't understand the 1 or 2 characters story between \e[16u and \e[17u, e.g. how is an emitting application supposed to know whether the terminal emulator knows the "show" counterpart of the specified "hide" character? Also, doesn't this construct introduce the first ever case in terminal emulation where a character outside of escape sequences does something other than print itself? Wouldn't it lead to unforeseen troubles? I'm also wondering why there's a need for this, and whether it would be better to let the terminal emulator pick its own preferred graphical representation (not necessarily something that could be printed by the app, perhaps some stock UI gadget).

The choice of the \e[...u framework is a pretty unfortunate one due to its potential clash with SCORC, in a similar way that DECSLRM kinda-sorta conflicts with SCOSC (see e.g. VTE 48).

My overall feeling is that DomTerm suddenly wants to do a whole lot of things at once that no terminal emulator did before (I, for one, would argue that nesting isn't necessary: it's more than enough to implement collapsing on the outermost level for the vast majority of use cases, and more complex ones should be done by terminal-based or graphical applications rather than terminal emulators), and (an impression based on the fresh comments at DomTerm 54) the way it does them is not really properly thought through and mature. These are just my feelings without having tried it or studied it closely, not a solid opinion.

Of course, only time can tell if a feature is going to be successful or not.

@albertz
Author

albertz commented Jan 3, 2019

Yes, I agree with most of what you said (except the nesting: I think that this is useful and also simple, once you have that feature in any form).

  • This is why I created this issue: To discuss the possible options, and to ultimately get this feature (widely available in common terminal emulators).
  • I'm actually not too happy with DomTerm's implementation. I just linked it as a reference to show that there is already a terminal emulator which implemented this. I was not suggesting that xterm.js should adopt the same escape code. I'm actually a bit confused about the exact definition of it myself (DomTerm 54). I tried to make use of it in my own app, but either I misunderstand how to use it, or it's buggy, or both (by trial and error, I have some working solution now here).
  • If there are people who want to have such a feature, someone has to come up with a suggestion for an escape code for this, and some terminal emulator has to implement this (and hopefully others will follow).
    • And there are definitely people who are interested in such a feature: @albertz, @egmontkob, Travis, DomTerm, Hyper, Final Term, and probably many more.
    • Existing solutions are probably not optimal:
      • DomTerm has one suggestion. But as discussed, probably not ideal.
      • Not sure about Final Term.
      • Travis's solution is probably not the best solution for terminal emulators (no escape code, just a custom string).
    • I.e. someone should come up with a clean definition. I was suggesting (implicitly by my issue here) that xterm.js could lead this. Because only if a widely used terminal emulator introduces such a definition and feature, it has a chance to get adopted.
  • I.e. closing this issue here seems a bit strange to me. Does that mean that xterm.js does not want to take the lead here? I would suggest reopening this.

@egmontkob

Nice screenshots of FinalTerm, but you know the project is discontinued?

@albertz
Author

albertz commented Jan 3, 2019

Yes I know. But this was also only for reference. I actually found it in the code now (search for collapse_button and is_prompt_line), and it seems that there is no special escape code for collapsing/folding, just for marking the prompt line. I definitely do not suggest to implement it like that.

@Tyriar
Member

Tyriar commented Jan 4, 2019

I'm also wondering why there's a need for this, and whether it would be better to let the terminal emulator pick its own preferred graphical representation (not necessarily something that could be printed by the app, perhaps some stock UI gadget).

To me, generic folding doesn't seem that useful, which is why many programs have a verbose output option. What would be more useful imo is a way of flagging ranges, for example like iTerm's shell integration, which lets the terminal know where the commands are, allowing a terminal emulator to fold the output if it wishes. Something I wished for earlier was the ability to flag a section of output with an alt text, so that screen readers read that instead of, for example, a graphical progress bar.
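To illustrate the range-flagging idea, here is a minimal sketch of how a terminal could locate command output in its input stream. It assumes FinalTerm-style `OSC 133` marks (`A` prompt start, `B` command start, `C` output start, `D` command finished), which is my understanding of what iTerm2's shell integration emits; the function name `command_output_ranges` is made up for this example and is not part of any real terminal's code.

```python
import re

# Assumed mark format: ESC ] 133 ; <letter> [; args] terminated by BEL or ESC \
MARK = re.compile(r"\x1b\]133;([ABCD])(?:;[^\x07\x1b]*)?(?:\x07|\x1b\\)")


def command_output_ranges(stream: str) -> list[str]:
    """Return the text between each 'output start' (C) and 'finished' (D) mark."""
    outputs = []
    start = None  # index just after the most recent C mark, if an output is open
    for m in MARK.finditer(stream):
        kind = m.group(1)
        if kind == "C":
            start = m.end()
        elif kind == "D" and start is not None:
            outputs.append(stream[start:m.start()])
            start = None
    return outputs
```

With ranges like these, folding becomes purely a terminal-side decision: the program only says where a command's output begins and ends.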

I.e. closing this issue here seems a bit strange to me. Does that mean that xterm.js does not want to lead such a role? I would suggest to reopen this.

Terminals supporting whatever they want seems like it's just going to lead to a more fragmented mess. I would hope in the future some form of standards body would arise to move terminals forward, as right now they seem somewhat stuck in time. I'm certainly open to being involved in a more coordinated effort though.

@egmontkob am I right in my assessment of the world here? Any insights here as I'm still relatively new to the scene 😄

@Tyriar Tyriar reopened this Jan 4, 2019
@Tyriar Tyriar added the type/proposal label Jan 4, 2019
@albertz
Author

albertz commented Jan 4, 2019

I uploaded my use case for folding here (scroll down to the screenshot in the Readme), which is a Python package to print a Python stack trace with extended information. You see an example there for the standard MacOSX Terminal, and then a screencast with DomTerm, which uses folding. I think this is pretty useful, at least for me (it provides a much more comprehensive overview of the stack trace, while still providing the details if you want to see them). And this is also an example showing that nesting can be useful (again, at least for me).

Btw, also Travis supports nested folding.

@egmontkob

I would hope in the future some form of standards body would arise to move terminals forward

Some of us are working right now on creating such a collaborative platform, expect an announcement/invitation soon :)

@Tyriar
Member

Tyriar commented Jan 4, 2019

@egmontkob 👌

@jerch
Member

jerch commented Jan 5, 2019

Intriguing idea, here are my first 2 cents/questions:

  • Where to put the folding sign in the view representation?
    If an emulator wants to support this, it needs some way to lay things out so the user understands what's going on. A typical position for folding signs is another bar on the left side, as seen in almost every GUI code editor. Since a terminal is all about text interfaces, this raises the question of whether it should be part of the text view itself or an outer GUI-driven element (much like a scroll bar in Turbo Vision vs. a scrollbar on the terminal view).
    Being part of the text itself (it seems DomTerm does that) has several issues - it will mess with the pty line editor, so the slave prog will have to unset ICANON and control the output itself, or the terminal would have to resize and indent the wraps. In fact, nothing is really gained here, as this is already perfectly doable: just use a curses-driven lib which deals with foldable paragraphs and does all the low-level stuff for you (like inserting/removing paragraphs and registering mouse/key actions).
    Doing the fold bar on a higher GUI level has issues as well - first, it's a waste of space if always on (and hardly ever used). Second, it introduces the "outer world" into a primarily text-driven env - how should a fullscreen text-only emulator render this? The Linux console? They would have to sacrifice the first text column and put some signs there. This again raises the question of why this is not done by a slave-side lib, if a prog really needs this.
    Nested folds (besides the question of whether they are needed at all) raise the question of how to represent the level. A simple solution would not deal with that and just put the fold signs in one column. Showing the level in a tree-like thingy seems to be a complete waste of space to me.
  • How to deal with folds on incoming data or during other typical actions like copy&paste? Should the folded state be copyable? Is incoming data always folded, always unfolded, or is there yet another sequence to define this?
  • How to deal with scrollbuffer here? What happens to truncated folds due to scrollbuffer limits? Shift the start marker downwards and remove it when it hits the corresponding end marker? This gets really funny for nested folds.

Those are only the first thoughts/questions that surface regarding a possible integration into what terminals do. Yet this "5m distance perspective" alone leads to cumbersome constellations; I think folding will only work with a good user experience when properly spec'ed beforehand and implemented in similar ways across different emulators.
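The scrollbuffer question above can be made concrete with a small sketch. It implements one possible policy, not an agreed one: when lines fall off the top of the buffer, a fold whose start was dropped keeps its remaining lines (the start marker "shifts down"), and a fold that was dropped entirely disappears. The `Fold` type and `trim_scrollback` function are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass
class Fold:
    start: int  # first buffer row of the fold (inclusive)
    end: int    # last buffer row of the fold (inclusive)


def trim_scrollback(folds: list[Fold], dropped: int) -> list[Fold]:
    """Drop `dropped` rows from the top of the buffer and adjust the folds.

    Rows are renumbered so that the new top row is row 0.
    """
    kept = []
    for f in folds:
        new_end = f.end - dropped
        if new_end < 0:
            continue  # the whole fold scrolled out of the buffer
        # A partially dropped fold now starts at the top of the buffer.
        kept.append(Fold(start=max(f.start - dropped, 0), end=new_end))
    return kept
```

Nested folds need no special casing under this policy, since each fold is adjusted independently; whether a truncated fold should still be collapsible is a separate decision the sketch leaves open.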

@PerBothner
Contributor

A "live" demo is here.
This is actually a "Saved as HTML" snapshot, but wrapped in JavaScript so text folding and dynamic resize (pretty-printing) work on the snapshot, just as they would in a live terminal.

@jerch
Member

jerch commented Jan 5, 2019

A bit more detailed this time - still a list of wild thoughts from my side with some proposals/ideas:

  • need for nestable folds
    This is likely to interfere with all other aspects and might be good to be clarified upfront.

  • visual representation
    My vote would go to a single-column thingy (fold bar); the way it shows up should be left to the emulator's needs (either as a text column or as a GUI sidebar). Imho it should not be part of the active terminal view, as it will mess way too much with the pty.
    Another question is whether a folded part should show some collapse info in the terminal beside the fold sign in the bar, and what to be shown there. Things that come to my mind here:

    1. default to truncated first line of the folded content, maybe with some additional folded "markup"
    2. make it customizable through the start marker

    Imho text attributes should be preserved on folded content. Same goes for more complicated actions during input (like OSC/DCS commands) - their results should stay in place if they are bound to parts of that terminal buffer.

  • accessibility
    We're going to need at least 3 new key combinations:

    • jump to next fold
    • jump to prev fold
    • toggle fold

    This is not an easy task, as the combinations should not already be used by any other popular prog, and will probably be occupied for the time being after being introduced.
    Mouse support should only be optional, as pure text driven emulators might not have mouse support at all. The mouse should not interfere with registered mouse protocols in the terminal view, thus its basically limited to the fold bar (maybe extend this optionally to the terminal view if no mouse protocols were registered by the slave prog).

  • behavior/integration with other terminal parts

    • copy&paste behavior
      I would favour a WYSIWYG behaviour here, meaning that copying from a folded terminal buffer would not include the folded content. I am pretty sure there are also use cases for an "always unfolded" copy behavior; imho this needs to be discussed with users' expectations in mind.
      The start/end marks should not be part of the copied content as they are just style hints for a terminal.
    • scrollbuffer interaction
      No clue yet myself how to deal with folds here, it basically boils down to the question whether to remove a fold at once or line by line from the scrollbuffer if the limit is hit. For this to work the scrollbuffer will have to be more clever than it used to be for most emulators. Not sure yet if a fold, that spans the "scrollbuffer - active view border" will introduce issues, being able to jump with the cursor into a fold region might screw up things (just a wild guess atm).
    • cursor state / position in terminal buffer (fold in the active terminal view)
      Folds in the active terminal view need some additional definitions, as they might mess up the buffer state when used with common cursor sequences. A question that arises is where to allow setting fold marks or how to deal with them while the cursor is in the middle of a line. From the motivation of the fold idea it seems logical to only support folds on line level, thus a spec would have to cover this "faulty" input and propose some default action (like autowrap to next line on any fold marks).
      Furthermore after the start mark was set the cursor could jump and place the end mark above that line, this also needs some state recovery (like swap the closest marks).
      Additionally, the cursor could span the marks over empty lines; the spec needs to say whether those lines should be treated as always collapsed or "realized" (maybe with line feeds).

Last but not least a halfway failsafe escape sequence should be found. This would have to deal with all sorts of faulty states like dangling marks and such. If going with the start/end mark thing this also raises the question how to deal with the open start mark while the end mark is not set yet.
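Two of the recovery rules suggested above (line-level folds only, and swapping marks that arrive in reverse order) can be sketched as follows. This is an assumed policy for illustration: a mark that arrives mid-line is treated as if the line had wrapped first, so it snaps to the start of the next row. The names `snap_to_line` and `normalize_fold` are hypothetical.

```python
def snap_to_line(row: int, col: int) -> int:
    """Return the row boundary for a mark; a mid-line mark moves to the next row."""
    return row if col == 0 else row + 1


def normalize_fold(start: tuple[int, int], end: tuple[int, int]) -> range:
    """Turn two (row, col) mark positions into a half-open range of folded rows.

    Marks are snapped to line boundaries, and swapped if the end mark
    was placed above the start mark.
    """
    a = snap_to_line(*start)
    b = snap_to_line(*end)
    if a > b:
        a, b = b, a
    return range(a, b)
```

For example, `normalize_fold((5, 3), (2, 0))` covers rows 2 to 5. A dangling start mark (no end mark yet) is not handled here and would still need a rule of its own.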

@jerch
Member

jerch commented Jan 5, 2019

@PerBothner Thx for the demo, looks pretty nice. Main question from my side - how do you deal with terminal size and the pty here?
Modifying and formatting the data (inserting chars + autoindentation) without being explicitly requested by the prog is way too much alteration of the original data for my taste.

@PerBothner
Contributor

"Main question from my side - how do you deal with terminal size and the pty here?"

The demo uses no pty and no server, except to serve static html, js, and css files.

The resizing/reflow all happens in the browser. It's similar to the reflow that some terminals (e.g. Gnome Terminal) do for wrapped lines - see issue #622. However, the application outputs markers into the output stream to mark structural elements. These are used to guide the line-breaking/re-flow - basically Lisp-style pretty-printing on-the-fly. These work even when the application (or the pty) is dead.

"Modifying and formatting the data (inserting chars + autoindentation) without explicitly requested by the prog is way to much alteration of the original data for my taste."

It is explicitly requested by the program, using special escape sequences.

@jerch
Member

jerch commented Jan 5, 2019

The demo uses no pty and no server, except to serve static html, js, and css files.

So this is not meant to run with a real pty and a shell?

@PerBothner
Contributor

"So this is not meant to run with a real pty and a shell?"

The folding/pretty-printing feature is definitely meant to be used with a pty and a shell. However, the demo does not use a real pty and shell. Think of it like an animated gif recording of an actual pty+shell session - but it's interactive.

@albertz
Author

albertz commented Jan 5, 2019

My comments:

  • I would vote for having the folding buttons in the text itself. That makes it easier on every side (the terminal emulator does not have to introduce any specific UI area for this), and also gives more control/freedom for the shell or the tool which wants to make use of this.
  • I think that nesting is very useful, and also not really problematic to support, in any of the possible cases.
  • I'm actually fine with mouse-only support. How is this handled for other features like hyperlinks? But introducing a keyboard shortcut should also not be too problematic (maybe just like some kind of focus for all kinds of buttons, including hyperlinks).
  • Copy & paste behavior: This was already discussed a bit here. I think both cases (copy only visible text, or copy all the text) can make sense under certain circumstances. That is why I think that there should be two separate escape codes, so that the app developer can choose what makes more sense for a particular use case.

@jerch
Member

jerch commented Jan 5, 2019

I would vote for having the folding buttons in the text itself. That makes it easier on every side (the terminal emulator does not have to introduce any specific UI area for this), and also gives more control/freedom for the shell or the tool which wants to make use of this.

It would not. In default ICANON mode this would mess with the lines sent from the pty - the pty has a notion of the actual terminal size and might send extra control chars like '\r' when the end is reached; if the terminal decides to add chars on its own behalf for whatever reason, this will fail badly. This is not a voting thing; it's more about being technically feasible.
Since you mentioned the shell - given how the responsibilities are currently shared between terminal and shell, this feature would suit the shell better than the terminal. Since we have no widespread shell doing it atm, I wonder if this is needed/highly requested at all.

@PerBothner
Contributor

An admitted problem with DomTerm's folding is that it isn't as clearly specified as I'd like. There are two main use-cases I'm focusing on, and the constraints are different:

  • Folding the output of a "command" in a shell or other REPL (along with input lines after the first). In this case we usually don't have the option of changing the shell internals, but we usually have the option of setting a prompt string that can be used to delimit commands, as well as distinguishing prompt, input, and output from each other. This is similar to some other terminals' "shell integration". In this case the fold button is in the prompt string, and the terminal uses the command delimiters to figure out what to hide.

  • A REPL that prints out some non-trivial data structure. For example the console of JavaScript debuggers in Chrome or Firefox. In this case nesting is obviously useful. When a fold button is pressed, DomTerm looks for "foldable sections" that are at the "same level" as the button. A foldable section can be delimited by an 83 escape code (see DomTerm spec) or a "logical pretty-printing block". The definition of "foldable section" needs better specification and documentation.

An enhancement of the latter is "lazy show". Some part of the output is hidden - and the application just sends a placeholder button, rather than the actual data. When the output is made visible, the terminal sends an escape sequence to the application, which responds with commands to update the newly-visible section of the output. This would be very useful for very large or "infinite" (cyclic) data structures. This is not implemented in DomTerm, but I have the outline of a protocol I can explain on request.

  • Not implemented, but @albertz has suggested/requested an option to specify a string name for both buttons and foldable sections: clicking on a named button flips all sections and buttons that have the same name. This enables one button to fold multiple related sections, even ones produced by no-longer-running applications. This is very general, but a bit more complicated for applications, so there should also be the simpler commands.

@PerBothner
Contributor

PerBothner commented Jan 5, 2019

"It would not. In default ICANON mode this would mess with the lines sent from the pty - the pty has a notion of the actual terminal size and might send extra control chars like '\r' when the end is reached, if the terminal decides to add chars on own behalf for wotever reason this will fail badly. This not a voting thing, its more about being technically feasible."

I think you misunderstand, at least how DomTerm does it. The application explicitly requests where to put the fold buttons. When it comes to input lines, a properly crafted prompt string includes space for the prompt button, so the readline (or similar) library can calculate the correct spacing, without knowing that it's a hide button - it's just a random Unicode character. (This assumes that the prompt string syntax has a way to specify non-printing characters, of course.) When it comes to folding of output, ICANON is irrelevant.

You could even use a double-width Unicode character for the fold button in a prompt string, as long as the input-editing library uses a suitable wcwidth implementation. In this case you need to make sure both the hide and show characters are the same width, but only include one of them in the "printing" part of the prompt string. (Or you can cheat: put two dummy single-column characters in the "printing" part of the prompt, and override them with escape codes or a styling option.)
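The width calculation described above can be sketched roughly as follows. This is a simplified stand-in for what a line-editing library does, not readline's actual algorithm: CSI escape sequences count as zero columns (in bash, such parts of a prompt are wrapped in `\[` and `\]`), and East Asian wide or fullwidth characters count as two. The function name `visible_width` is made up for the example.

```python
import re
import unicodedata

# CSI sequences: ESC [ parameters, optional intermediates, one final byte.
CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def visible_width(prompt: str) -> int:
    """Columns a prompt occupies: escapes count 0, wide characters count 2."""
    width = 0
    for ch in CSI.sub("", prompt):
        if unicodedata.combining(ch):
            continue  # combining marks attach to the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width
```

If the hide and show glyphs report different widths under a check like this, swapping one for the other would shift the rest of the input line, which is why both need the same width.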

@jerch
Member

jerch commented Jan 5, 2019

Yes, my bad: ICANON will not affect this; it's libs like libreadline that are affected by this in their line calculations. And the space trick would make the needed room. Thanks for the clarification.
Now what happens if that fold sign room ends up in one of the last line cells - how does the autoindentation deal with this?

@PerBothner
Contributor

"Now what happens if that fold sign room ends up in one if the last line cells - how does the autoindentation deal with this?"

Not sure what the concern is. The fold sign is just like any other printing character (perhaps styled differently) until you click on it. At which point it flips between the show/hide versions - and may show/hide subsequent text, which will force a re-flow for the affected lines.

There is no "autoindentation" per se. There are markers in the buffer, placed there by explicit escape sequences. The line-breaking algorithm may add or remove indentation (whitespace or special characters, as requested), depending on text folding and line width.

The line-breaking algorithm is a bit complicated (partly for performance reasons), so there could of course be bugs or under-specified corner cases.

@PerBothner
Contributor

In case it isn't clear: The demo isn't a pure text folding demo. It also makes use of an orthogonal feature: pretty-printing, which allows the application to specify line-breaking and indentation based on logical (application-specified) structure. All the auto-indentation is part of the pretty-printing feature set. While text folding is conceptually distinct from pretty-printing, they are of course designed to work well together.

@sedwards2009

I'm also the developer of a terminal emulator. Its goals are to extend and modernise the terminal environment with the kinds of features that albertz is talking about, while at the same time remaining compatible with the applications in the terminal ecosystem.

Extraterm implements the "capturing command output" use case with its own shell integration, which works via the shell prompts and/or pre-exec and post-exec hooks in the shell (fish, bash and zsh). Command output is shown in a "frame" and separated from the surrounding text. You can also perform actions on the whole frame of output. Making it possible to fold it is on my TODO list.

I find this discussion quite interesting because I see the value in terminal applications being able to mark (nested) logical sections in their output as a hint to the terminal emulator.

Some quick thoughts:

  • Showing/hiding blocks of text should be done purely on the terminal side. If you need the remote application to be running to serve blocks of text via some protocol then you might as well just write an outline viewer using ncurses and implement all of the folding on the remote side.

  • This feature should not modify the terminal grid model by introducing extra characters or prompts or buttons which appear inside the character grid. This just makes it more complicated than it needs to be.

  • Although it is smart to consider how a terminal may implement the UI for this feature, any kind of spec should concentrate on semantics and not on whether an arrow is shown on the left side of the window.

  • It must be possible for the chosen escape sequence(s) to be ignored by terminals which don't understand it. The result is then the whole text in expanded form (i.e. nothing folded/hidden).

  • I find nesting or different levels to be useful. It is useful in text documents (i.e. heading level 1, heading level 2, heading level 3 etc), and I think it makes sense in other text output like software build tools.

  • Some way of marking a section as folded/hidden by default may be useful too. For example, the logging output of an application may hide log lines at info and debug severity levels by default.

I can see this feature working on a protocol level in a similar fashion as markdown headings. Each escape sequence marks the start of a level with a specified "depth". For example, imagine that the angle brackets represent the escape sequence in the VT stream:

This is text as the default level 0. This is text as the default level 0. This is text as the default level 0.

<level 1>The title of the level 1 block goes here
Text inside a level 1 block. Text inside a level 1 block. Text inside a level 1 block. 

<level 2>The title of the level 2 block goes here
Text inside a level 2 block. Text inside a level 2 block. Text inside a level 2 block. 

<level 2>Another block of text at level 2 with a title.
Text inside a level 2 block. Text inside a level 2 block. Text inside a level 2 block. 

<level 0>Back to the default level.
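A terminal could turn a stream like the one above into a tree of foldable sections with a simple stack. The sketch below is only an illustration: it parses literal `<level N>` text as a stand-in for the (not yet defined) escape sequence, and the `Section` type and `parse_sections` function are hypothetical.

```python
import re
from dataclasses import dataclass, field

LEVEL = re.compile(r"^<level (\d+)>(.*)$")  # stand-in for the real escape sequence


@dataclass
class Section:
    level: int
    title: str
    body: list[str] = field(default_factory=list)
    children: list["Section"] = field(default_factory=list)


def parse_sections(text: str) -> Section:
    """Build a tree of sections from heading-style level markers."""
    root = Section(level=0, title="")
    stack = [root]  # stack[-1] is the section currently receiving lines
    for line in text.splitlines():
        m = LEVEL.match(line)
        if m is None:
            stack[-1].body.append(line)
            continue
        level, title = int(m.group(1)), m.group(2)
        # A marker at depth N closes every open section at depth >= N.
        while len(stack) > 1 and stack[-1].level >= level:
            stack.pop()
        if level == 0:
            root.body.append(title)  # back to the default, unfoldable level
        else:
            section = Section(level, title)
            stack[-1].children.append(section)
            stack.append(section)
    return root
```

Because a marker only opens a level and never needs an explicit close, a terminal that ignores the sequence simply shows all the text expanded, and there are no dangling end marks to recover from.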

@PerBothner
Contributor

@sedwards2009: "This feature should not modify the terminal grid model by introducing extra characters or prompts or buttons which appear inside the character grid. This just makes it more complicated than it needs to be."

I disagree:

  • If buttons cannot be in the character grid, that means the terminal must allocate space for a "gutter".
  • It may be desirable to have fold buttons indented from the left column, like in the JavaScript consoles for Chrome or Firefox.
  • It may be desirable to fold sections of text that are not complete lines. For example you might print a nested list that fits all on one line when sections are hidden, but will require multiple lines when everything is visible. In that case it is better to have the fold buttons after some initial text. Try my demo and adjust the window to both very wide and very narrow.

@PerBothner
Contributor

"What I believe could be a better approach, in multiple aspects, is to invent escape sequences that add semantics to the data stream."

DomTerm has escape sequences to delimit prompt, input and error text. (Output is presumed to be the rest.) See the CSI 12u, 11u, 14u, 13u, 18u, 15u in the linked-to documentation.

There are escape sequences to delimit command groups, which may be nested. See the OSC 119, 120, 121 commands. See here for suggestions on how to use these escapes. The basic usage is that each prompt specifies a 119 escape to indicate the start of a new group, and implicitly the end of an existing group with the same id. To support nested command groups, specify a group-id. This could be a process id for the shell. Optionally send enter/exit group commands when you start/end a shell, but just using the 119 escape with group-id seems to handle nesting fairly well, without explicit enter/exit commands.
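The implicit-close rule described above can be modelled as a stack of open groups. This is a sketch of the idea as described in this comment, not DomTerm's actual implementation; the function name `apply_group_start` and the use of shell process ids as group ids are assumptions for the example.

```python
def apply_group_start(open_groups: list[str], group_id: str) -> list[str]:
    """Return the stack of open command groups after a 'start group' mark.

    If a group with the same id is already open, it and everything nested
    inside it are closed first; the new group then nests in what remains.
    """
    if group_id in open_groups:
        open_groups = open_groups[: open_groups.index(group_id)]
    return open_groups + [group_id]


# Outer shell (pid 100) prompts twice, starts a nested shell (pid 200),
# which prompts twice and exits; then the outer shell prompts again.
stack: list[str] = []
for gid in ["100", "100", "200", "200", "100"]:
    stack = apply_group_start(stack, gid)
```

After the last prompt the stack is back to just the outer shell's group, which is why explicit enter/exit commands are not strictly needed.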

It would be great if we could standardize these or similar commands.

@sedwards2009

Extraterm also has some custom escape sequences. I've designed the sequences with a degree of security paranoia in mind. Each sequence contains a cryptographically secure random "cookie" parameter. Each terminal tab in Extraterm requires a different cookie before it will accept the escape sequence. The cookie is available via an environment variable.

A remote application can use the escape sequences because it can read the cookie from the environment variable. But using cat to show the contents of a log file, for example, is still safe because it can't use or trigger the escape sequences. The log file contents won't have the cookie.

We trust applications, but we distrust any "data" which has found its way into the VT stream.
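A minimal sketch of such a cookie check, under stated assumptions: the environment variable name `EXAMPLE_TERM_COOKIE` and all function names are invented for illustration and are not Extraterm's real names or code.

```python
import hmac
import os
import secrets


def new_tab_cookie() -> str:
    """Generate a fresh random cookie for one terminal tab (32 hex digits)."""
    return secrets.token_hex(16)


def child_environment(tab_cookie: str) -> dict[str, str]:
    """Environment for the tab's shell; the variable name is hypothetical."""
    env = dict(os.environ)
    env["EXAMPLE_TERM_COOKIE"] = tab_cookie
    return env


def accept_sequence(tab_cookie: str, presented_cookie: str) -> bool:
    """Accept an extension escape sequence only if it carries this tab's cookie."""
    # Constant-time comparison, so timing does not leak how much matched.
    return hmac.compare_digest(tab_cookie.encode(), presented_cookie.encode())
```

An application started in the tab can read the cookie from its environment and include it in each sequence; bytes that merely pass through the tab, such as a log file shown with cat, cannot know it.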

@egmontkob

with a degree of security paranoia in mind

This sounds like pure and pointless paranoia to me. There are tons of other escape sequences that can occur in a text file and can mess with your terminal in various ways, e.g. trigger some responses as if they were typed from the keyboard, wipe the scrollback buffer, switch to a weird character encoding, switch to invisible letters, an invisible cursor and an unreadable color palette etc. These can all mess up the terminal, but are otherwise harmless. Incorrectly defining semantic blocks, e.g. the location of the prompt, isn't any more serious than these. (Some others, not implemented by many terminal emulators due to concerns, can even do more risky things like initiate a resize or move of the terminal window, set the clipboard contents etc.) It's a user error to cat (and not cat -v) an untrusted file and expect it never to mess up the terminal. Meanwhile, if it's a log file, I'd argue that it's also an error in the producer of the log file if it can contain escape sequences; a log file shouldn't be untrusted data.

@sedwards2009

Yes, I am aware that there are heaps of ways of messing up a terminal and making it unusable. I'm not really worried about things getting "messed up". I'm concerned about more serious security problems which could creep in as I add more features and escape sequences.

It's a user error

Allowing data dumped straight to the terminal to have access to potentially dangerous escape sequences is stupid when it can be avoided. There is no up-side to allowing this. There is no use case here. Building traps for the user and then blaming them when it goes wrong by calling it "a user error", is a horrid approach to security.

Err on the side of security first, and not the other way round is my advice.

@egmontkob

There is no up-side to allowing this.

To make them work seamlessly across ssh, across su/sudo, inside detached and reattached screen/tmux. To make them recordable and replayable using script/scriptreplay. To make them redirectable to another terminal...

Security is important, but if it unconditionally trumps everything then I doubt you can end up with anything usable. You can't password-protect every command that, let's say, changes the color, or defines a foldable section, or whatever.

New escape sequences and new features should be added with care, and rejected if there is doubt about their security or privacy implications.

My advice is to stick to existing practice, treat security with common sense and not confuse it with paranoia.

Anyway, it's getting pretty off-topic...

@PerBothner
Contributor

@egmontkob: "While the demo is indeed cool, it is equally useful? Will users actually bother to click on those arrows to fold/unfold, will it save them a noticeable amount of time?"

Well, this feature appears to be common and used in debuggers - at least JavaScript debuggers. A goal for DomTerm is to be not just a solid terminal emulator, but also a toolkit for REPLs, like some of the kinds of things people use Jupyter for: output with images, mathematics, rich text. I'd love to be able to seamlessly switch between that kind of "REPL toolkit" and an xterm-compatible terminal, and mix-and-match their features.

"I consider the solutions proposed so far for [foldable tree data structures] to be too invasive and complicated."

I disagree - but it may be too complicated and esoteric to attempt to standardize it at this point.

However, it seems worthwhile to standardize a way to delimit prompts, inputs, and commands (possibly with nesting), in a way that can be put in a prompt string. Some terminals add an indication in a gutter area. Other terminals could add appropriate actions, such as folding, to a context (right-click) menu when the mouse is in the prompt area.

@jerch
Member

jerch commented Jan 9, 2019

Yeah the semantic idea looks most promising to me, it would make it possible to deal with a bunch of purposes at once, yet let the emulators deal with the "how", which could range from ignore to present the data in a super special fancy way. Also competition in this field is good as emulators might try different solutions.
Btw we already have some semantic escape sequences in OSC like the title thing, and I would also count egmont's URL proposal in this field. This might already give some hints on how to lay things out at the escape sequence level; discussing it here might be beyond this thread.

About security (slightly offtopic) - I don't see a higher security risk per se from a "semantic terminal", as it still works on the common unix rights system/privileges and as long as it is only dealing with data representation. Problems might arise though by tricking the user to do unwanted things (see our first and hopefully last CVE, or search for malicious escape sequences). For semantic additions that deal with hiding content (like folding), this might be a problem:

#> echo "Hi <hide>" && maliciousCommandB && echo "</hide> there!"

If <hide> is an OSC sequence the casual terminal user can hardly tell anymore, if there is something fishy going on. Are we entering this world of security issues with it? Imho we already have this problem, many webpages propagate their super-dooper cmdline toolset via:

#> wget .... | bash
or really frightening:
#> wget ... | sudo bash

and no one really inspects that downloaded stuff beforehand, lol.

Edit: Btw the URL proposal has the same tricking peeps issue, but it does not harm anyone until the URL might get called (leaving the data representation only field). Thus a semantic addition that carries actions beyond the visual things might need extra security countermeasures.

@egmontkob

#> echo "Hi <hide>" && maliciousCommandB && echo "</hide> there!"

I don't think a semantical addition should be able to instruct the terminal to hide the contents. A semantical addition could inform the terminal: this is a prompt, this is the command entered after the prompt, etc. IMO they should all show up by default as before.

If the terminal offers a way to collapse a utility's output, and the user does so, there are still plenty of possibilities. Highlighting could copy the shown parts only. The emulator could auto-show a block if the contents within changes. Terminals could present a popup if the user copies hidden text. And so on. It's of course wise to think about these, and if we have a proposal, document the possible security issues we can foresee with various UI ideas that terminals might implement based on these sequences.
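
Two of the mitigations listed above (copying only the shown parts, and auto-showing a block whose contents change) can be sketched with a tiny block model. This is an illustration under assumed names (`Block`, `copy_visible`), not the behavior of any existing terminal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """A foldable run of output lines, e.g. one command's output."""

    lines: List[str] = field(default_factory=list)
    collapsed: bool = False

    def append(self, line: str) -> None:
        self.lines.append(line)
        # Auto-show: new content must never arrive unseen in a folded block.
        self.collapsed = False


def copy_visible(blocks: List[Block]) -> str:
    """Return only what the user can actually see, for the clipboard."""
    return "\n".join(
        line for block in blocks if not block.collapsed for line in block.lines
    )
```

With this model, text hidden inside a collapsed block cannot end up in the clipboard unnoticed, and a block that the user collapsed reopens as soon as a program writes into it.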

wget .... | bash [...] no one really inspects that downloaded stuff beforehand, lol

Same goes for when you download or git clone a repository and compile/install/run it.

URL proposal [...] does not harm anyone until the URL might get called

In the comment section of that proposal I argue that they shouldn't even cause harm if the URL is clicked, but IMO let's not derail this thread, this should be discussed over there if there are any remaining concerns.

@jerch
Member

jerch commented Jan 9, 2019

I don't think a semantical addition should be able to instruct the terminal to hide the contents

Yeah, it's orthogonal to the semantic thing, but it still belongs to the folding idea this thread was started with.

Should we move the semantic discussion to another thread?

@albertz
Author

albertz commented Jan 9, 2019

Independent of whether this is just about semantics or specifically about folding: keep also in mind that there are both use cases where you want to hide the text by default (see e.g. Travis, or my better exchook use case), and use cases where you want to show it by default (e.g. any command output). You could have two separate escape codes for the two cases.

@sedwards2009

@egmontkob

Security

I think you are overestimating the impact of the scheme I described. I use this daily in Extraterm for a number of features and it works across sudo and ssh.

But it won't work through screen/tmux without changes to those apps, though that applies to almost every new escape sequence we could come up with.

I grant you that most escape sequence additions are likely to be benign and not require this kind of security approach. But at the same time I've already got some extra sequences in my terminal project which I wouldn't feel remotely comfortable having available to untrusted data.

@sedwards2009

@PerBothner

REPLs... Output with images, mathematics, rich text

I'm also interested in these use cases. The approach I'm leaning towards is allowing applications to send data in common formats (jpg, png, svg, etc.) and displaying them in between VT output, effectively stopping the grid, inserting the image/whatever, and then creating a new empty grid with position (1,1) immediately under the outputted image.

@sedwards2009

Should we move the semantic discussion to another thread?

Yes please. Some kind of general mechanism for associating data and metadata (URLs, file paths, hostnames, git branches, etc.) with things in the VT stream could be very useful. It would also be a huge discussion.

@egmontkob

egmontkob commented Jan 9, 2019

@albertz

where you want to hide the text by default

I'd argue that this goes beyond the scope of defining semantics into defining the desired behavior, and as per my previous comments, I'd rather see this responsibility and functionality remaining at custom applications and not being pushed to terminal emulators.

@PerBothner
Contributor

@egmontkob : "I'd rather see [data structure folding] remaining at custom applications and not being pushed to terminal emulators."

A custom application running in a terminal using an ncurses-style library can't fold text except within the visible (non-scrolled) part of the output. I don't believe there are any xterm escape sequences to manage scrolling or navigate above the "home" line. Of course an application can repaint scrolled-out output, but it's not integrated with the scrollbar. It might be interesting to design a protocol to deal with scrollbars and large scroll regions, but I haven't seen anything like that.

@egmontkob : "The terminal emulator is not a graphical UI toolkit, not something that offers widgets like a treeview, and IMO it shouldn't. This is the wrong level to solve this issue, the right level would be a dedicated application using ncurses or whatever similar (or a graphical app)."

One way to use DomTerm is as a high-level GUI toolkit for writing rich REPL consoles. You call the toolkit using escape sequences rather than procedure calls. This is much easier and more powerful than using an ncurses-style library (if you're implementing a rich REPL console, not necessarily for other things you might use ncurses for). Plus you have the same programming interface for rich and plain terminals, just with a downgraded UI for the latter.

@PerBothner
Contributor

@sedwards2009: "The approach I'm leaning towards is allowing application to send data in common formats (jpg, png, svg, etc) and display them in between VT output, effectively stopping the grid, inserting the image/whatever, and then creating a new empty grid with position (1,1) immediately under the outputted image."

I think this is a good model for enhanced terminal emulators. For xterm.js and other terminal emulators based on JavaScript I would allow for "image/whatever" arbitrary (sanitized) HTML in a <div> element.
I think this is a way to get 90% of DomTerm's functionality, while keeping xterm.js's performance. Conceptually, you'd have two kinds of lines: character-grid lines, and non-grid non-character-editable variable-height "rich" lines in a <div>.

Non-JavaScript terminals could support a subset of the functionality by only allowing images, but not general HTML.

While a <div> section would not be editable with normal escape sequences, you can still support editable sections if they're marked with an id attribute, using an escape sequence to replace the marked section (by assigning innerHTML).

Folding text and fancy pretty-printing can be restricted to the rich-text sections.
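
The two-kinds-of-lines model described above can be sketched as a data structure: a buffer that mixes character-grid lines with opaque "rich" lines, where a rich line can be replaced by id. The type and function names (`GridLine`, `RichLine`, `replace_rich`) are hypothetical, and the HTML is assumed to have been sanitized before it reaches this layer.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class GridLine:
    """A normal terminal row, editable cell by cell via escape sequences."""

    cells: str


@dataclass
class RichLine:
    """A variable-height block holding sanitized HTML (or just an image)."""

    html: str
    element_id: Optional[str] = None  # set when the section is replaceable


Line = Union[GridLine, RichLine]


def replace_rich(buffer: List[Line], element_id: str, html: str) -> bool:
    """Replace the content of the rich line marked with element_id.

    Mirrors the idea of assigning innerHTML on the marked element.
    Returns False if no such section exists.
    """
    for line in buffer:
        if isinstance(line, RichLine) and line.element_id == element_id:
            line.html = html
            return True
    return False
```

Grid lines stay on the fast path, while folding and pretty-printing only need to be supported inside `RichLine` entries.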

@albertz
Author

albertz commented Jan 10, 2019

@albertz

where you want to hide the text by default

I'd argue that this goes beyond the scope of defining semantics into defining the desired behavior, and as per my previous comments, I'd rather see this responsibility and functionality remaining at custom applications and not being pushed to terminal emulators.

Well, this is also a semantic distinction:

  • In the one case, it is kind of additional information for some context, which is maybe not relevant when just glancing over it. Again, see both examples I gave:
    • Travis automatically folds everything away (or maybe only if there is no error, not exactly sure). Because usually, the output is not relevant.
    • In Python better exchook, the local variables should be folded away by default, as they are also additional information, which is often not relevant - only sometimes, you might want to see them. (Similar in any debugger: You will only see local variables of a selected frame, not of all frames together.)
    • I can imagine basically any tool which outputs some logging on stdout, where the tool would also output extra information which is not so relevant. That is currently what the verbosity option is for in many tools, which is an ugly workaround because there is no such thing as folding in terminals. Otherwise the tool could just print everything, and fold away the more verbose details by default.
  • In the other case, maybe the output is relevant initially, but then after seeing it you might want to fold it away, for a better overview. Actually this is a use case which is less important for me. But when you look at what other people wrote here, I think some of them assume this to be the default behavior. This behavior would not make much sense for my use cases. And this is because there is a semantic difference.
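
The semantic difference described in the list above could be carried by two distinct start markers. The following sketch parses a stream into sections tagged with their default state. The marker syntax (an OSC with the payload `fold;start-collapsed`, `fold;start-expanded` or `fold;end`) is entirely made up for this example, no terminal implements it, and nesting is deliberately not handled.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical markers: ESC ] fold;<kind> BEL
MARKER = re.compile(r"\x1b\]fold;(start-collapsed|start-expanded|end)\x07")


def parse_sections(stream: str) -> List[Tuple[str, Optional[bool]]]:
    """Split a stream into (text, collapsed_by_default) pairs.

    The second item is None for text outside any foldable section,
    True for sections folded by default, False for sections shown by default.
    """
    sections = []
    state = None  # not inside a foldable section
    pos = 0
    for match in MARKER.finditer(stream):
        text = stream[pos:match.start()]
        if text:
            sections.append((text, state))
        kind = match.group(1)
        state = None if kind == "end" else (kind == "start-collapsed")
        pos = match.end()
    tail = stream[pos:]
    if tail:
        sections.append((tail, state))
    return sections
```

A traceback printer would wrap local variables in a `start-collapsed` section, while a shell could wrap ordinary command output in a `start-expanded` section that the user may fold later.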

@egmontkob

Of course an application can repaint scrolled-out output, but it's not integrated with the scrollbar.

But there's nothing new in it. That's the experience you get in text viewers and editors, terminal multiplexers etc. as well.

It might be interesting to design a protocol to deal with scrollbars and large scroll regions, but I haven't seen anything like that.

I love this idea, would really love to participate in the design and implement it in VTE! The way I currently imagine it is that on the normal screen the scrollbar would keep working as it does now, whereas on the alternate screen newly designed two-way escape sequences could set the scrollbar size and location from the application, and could report back scrollbar positioning events. There could also be an option to select whether mouse wheel and Shift+PgUp-like events send the raw mouse/keyboard event or a scrolling event.

One way to use DomTerm is as a high-level GUI toolkit for writing rich REPL consoles.

I'm trying to find a way to express my feelings without repeating myself too much. I haven't tried DomTerm, I only get an impression here about what it can do.

I'm not convinced that REPL consoles and the need for toggleable blocks are generally as frequent and basic a use case as your workflow, which seems to revolve around them, suggests. I think it's just one pattern out of plenty that an application might want to have. I might easily be wrong here, or not up-to-date with current trends.

If DomTerm finds these important then great, by no means do I want to discourage it from doing so. But I find the features too arbitrary, too specific, and immature to push towards a cross-terminal emulator adoption. DomTerm introduces new paradigms, it's – as you said – basically a mixture of a terminal emulator and a GUI toolkit. It's not a direction I'm keen on at the moment, and IMHO not something other terminals should reasonably be expected to eagerly follow.

I'm also somewhat concerned about DomTerm doing it "all in one" or "all at once", switching to something brand new due to multiple reasons, without addressing them (and their potential other use cases) one by one. E.g. there's the scrollbar issue, one of the reasons why you decided to push the feature from an ncurses app to the terminal emulator. But let's stop for a moment, look around for other use cases. As I said, text editors also suffer from the scrollbar problem. If we do anything, shouldn't it address them, too? Should we go into the direction where a text editor can "cat" the entire file, and then instruct the terminal to scroll back, change the data in the scrolled back region so that you get real file editing experience combined with the actual scrollbar? Probably not. Although it feels to me that this approach would be in a somewhat similar direction to yours: pushing some of the app's intended features from the app to the terminal because an app can't do it well enough. Let's instead perhaps address the core problem. Why can't an app implement it well enough? Because of poor scrolling? Let's address scrolling, then. If we do so, not only ncurses-based REPL tools benefit from it but also text viewers and editors, that is, a much larger user base. Plus, you'd no longer need to push the treeview toolkit to the terminal emulator, it could be done in ncurses. Sounds like a much cleaner, simpler, easier to implement, likely to get widely adopted approach to me. (Note: by conveniently and sloppily saying "ncurses" I meant any app that switches to the alternate screen, not just ncurses-based ones.)

Based on how the adoption of way simpler features go across terminal emulators (typically: quite slowly, and with quite some amount of resistance), based on seeing how much developer resources there are to improve terminal emulators (typically: very little), I find such big changes, such changes in the essentials pretty unlikely to succeed. I'd much rather see a series of small and independent changes, such as adding semantical markers for the prompt, or being able to use the scrollbar on the alternate screen.

@PerBothner
Contributor

@egmontkob: "I love a [protocol to deal with scrollbars and large scroll regions], would really love to participate in the design and implement it in VTE! The way I currently imagine it is that on the normal screen the scrollbar would keep working as it does now, whereas on the alternate screen newly designed two-way escape sequences could set the scrollbar size and location from the application, and could report back scrollbar positioning events."

It would be nice if such a protocol could work with sub-windows of the terminal. Consider running Emacs with multiple sub-windows. You don't want a scrollbar for the Emacs session as a whole, but you want the option of adding scrollbars to some (but not all) of the sub-windows. The application could send a request to "split the current sub-window into two side-by-side sub-windows in the ratio 60%+40%, with a scroll bar on the left sub-window only"; the terminal would send back a reply specifying the number of available columns and rows for each sub-window.

More flexibly, you can allow overlapping sub-windows, but then you have to figure out how to bring a sub-window to the top. Better to stick with non-overlapping (tiling) sub-windows, at least to start. That is enough for Emacs, at least.

Adjusting a scrollbar from an application would be fairly simple: a command that specifies a (sub-window) number, scroll top (as a fraction) and scroll visible amount as a fraction. It would be nice for terminal to remember a region that has scrolled out of view, so it can display if it scrolls back into view without waiting for the application to repaint.

When the user adjusts the scroll bar, that would use a protocol similar to mouse events. It could wait for the application to update the display, but that might be a bit laggy. Better to do the scroll locally and tentatively without waiting for the application response.

@egmontkob

egmontkob commented Jan 11, 2019

I was afraid this was going in this direction :(

As stated in my preceding comments, I don't want to turn terminal emulators into graphical toolkits where apps can place pretty much arbitrary elements (treeviews, scrollbars...) anywhere.

Sure, a single scrollbar is not ideal when you have two sub-windows vertically. But what if you have them next to each other? (Not sure if emacs can do it; if not then another editor can sure do.) Then you'd need to have a scrollbar in the middle of the window. What if they want a horizontal scrollbar, too? UPDATE: I just read it carefully that you were actually talking about the side-by-side split case, and not when the windows are above each other. This is exactly the terrible complexity (including how this plays along nicely with geometry hints etc.) that I'm firmly against going into.

We already have amazing infrastructures out there on top of which apps can do anything they want to. It's the graphical systems and toolkits. There's no point in turning terminal emulators (let alone, many of them in parallel) into yet another of these, and I doubt we have resources to make such an attempt even remotely successful.

I don't want to do anything more than just hooking up to the single scrollbar that we already have now but not use on the alternate screen. (Emacs could still make this scrollbar reflect the currently focused sub-window.)

scroll top (as a fraction) and scroll visible amount as a fraction.

Nitpicking, but I'm against using fractions, and not just because the CSI and friends escape sequences don't allow dots. We should use integers, even if it takes one more of them (numerators: visible height, scrolling offset; denominator: document height – in lines).
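
The integer-only representation suggested above (visible height, scrolling offset, document height, all in lines) can be sketched as follows. The `ScrollState` type and the `visible;offset;total` parameter order are assumptions for illustration, and no such escape sequence has been specified.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ScrollState:
    visible: int  # lines shown in the viewport
    offset: int   # index of the first visible line
    total: int    # document height in lines

    def __post_init__(self) -> None:
        if self.visible < 1 or self.total < 0:
            raise ValueError("visible must be >= 1 and total >= 0")
        if not 0 <= self.offset <= max(0, self.total - self.visible):
            raise ValueError("offset out of range")

    def params(self) -> str:
        """Semicolon-separated integers, usable as escape sequence parameters."""
        return f"{self.visible};{self.offset};{self.total}"

    def thumb(self, track: int) -> Tuple[int, int]:
        """Thumb (top, height) in pixels for a scrollbar track of given height."""
        if self.total <= self.visible:
            return 0, track  # everything fits: the thumb fills the track
        top = self.offset * track // self.total
        height = max(1, self.visible * track // self.total)
        return top, height
```

The terminal never needs a fraction on the wire: it derives the thumb geometry from the three integers using integer arithmetic.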

It could wait for the application to update the display, but that might be a bit laggy.

There's no way you could ever reliably connect the scrollbar dragging event to the display update counterpart.

Better to do the scroll locally and tentatively without waiting for the application response.

Yep.

It would be nice for terminal to remember a region that has scrolled out of view, so it can display if it scrolls back into view without waiting for the application to repaint.

Nice idea! I don't think it influences the specs in any way (or maybe an app should be able to turn this behavior on/off), but implementing it should be optional for the emulator.

Another thing we have to watch out for: a race condition if the user drags the scrollbar at pretty much the exact time an application also wants to update the scrollbar position (e.g. "less" follows a file's growth). Then the two events cross each other and both parties believe that the event of the other one is the newer. I think the proper solution is to require exactly one side (to be decided whether the terminal or the app) to echo back all the events, that is, reinforce the said position.
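
The race described above, and the effect of having one side echo every position back, can be shown with a toy simulation. It assumes the terminal is the echoing side, which the comment leaves undecided, and models the two directions of the connection as simple queues. The function name `simulate` is hypothetical.

```python
from collections import deque
from typing import Tuple


def simulate(echo: bool) -> Tuple[int, int]:
    """Return (terminal_position, app_position) after two crossing events."""
    to_app = deque()   # messages travelling terminal -> application
    to_term = deque()  # messages travelling application -> terminal

    # Both events happen before either side sees the other's message.
    term_pos = 10      # the user drags the scrollbar to line 10
    to_app.append(10)
    app_pos = 50       # the app follows file growth and jumps to line 50
    to_term.append(50)

    while to_app or to_term:
        if to_term:
            term_pos = to_term.popleft()
            if echo:
                # The terminal reinforces the position it ended up with.
                to_app.append(term_pos)
        if to_app:
            app_pos = to_app.popleft()
    return term_pos, app_pos


if __name__ == "__main__":
    print(simulate(echo=False))  # the two sides disagree
    print(simulate(echo=True))   # the two sides converge
```

Without the echo, each side treats the other's message as the newer one, so the terminal ends at 50 while the application ends at 10. With the echo, the last message the application receives is the terminal's final position, so both settle on 50.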

@PerBothner
Contributor

"We already have amazing infrastructures out there on top of which apps can do anything they want to. It's the graphical systems and toolkits. There's no point in turning terminal emulators (let alone, many of them in parallel) into yet another of these,"

It's not so much turning terminal emulators into graphics toolkits, as defining a wire byte protocol for communicating between an application and a terminal emulator that makes use of a toolkit. For example gnome-terminal, implemented using Gnome/Gtk, would use the latter to create sub-panes and scrollbars. Or JavaScript-based terminals would use the available HTML elements. The idea is something similar to curses - but with a richer UI. And that is not something that is readily available - and I think it would be useful.

"I don't want to do anything more than just hooking up to the single scrollbar that we already have now but not use on the alternate screen."

Sure, that's a useful start, but I don't know of an application that would really benefit. I can think of multiple applications (emacs and screen/tmux, to start) that could really benefit from scrollable sub-windows.

(Actually, DomTerm makes (minor) use of the scrollbar on the alternate screen, since you can scroll up to the main screen. The alternate screen is considered to be appended to the end, after the main screen.)

"(Emacs could still make this scrollbar reflect the currently focused sub-window.)"

That could be a somewhat unusual and slightly ugly UI, since the scrollbar height would be the entire height, but the scrollable region would be just the main text area (or subpane). On the other hand, it would save some screen real estate, and might be easier to implement in both terminals and applications. Probably worth starting with that, at least.

However, do note that local repaint would seem to require the terminal to know what region to repaint (and what to save in the first place). Xterm does have commands to set a rectangular region - one could perhaps use those.

"Nitpicking, but I'm against using fractions"

I didn't mean fractions specifically, but rather some way to specify a number between 0 and 1. Maybe an integer interpreted as per-mille (like percentage but divide by 1000 rather than 100). Per-mille should be enough resolution.
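
To get a feel for whether per-mille resolution is enough, here is a small sketch of the conversion in both directions using integer arithmetic only. The function names are hypothetical.

```python
def to_per_mille(offset: int, total: int) -> int:
    """Scroll position as an integer in 0..1000 (rounded down)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return offset * 1000 // total


def from_per_mille(value: int, total: int) -> int:
    """Line offset corresponding to a per-mille position (rounded down)."""
    return value * total // 1000
```

For a 100-line document the round trip is exact. For a 100000-line document one per-mille step covers 100 lines, so offsets 12300 and 12399 both map to 123. That is fine for drawing a scrollbar thumb, but too coarse if the same value is also meant to identify an exact line.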

@jerch
Member

jerch commented Jan 11, 2019

Seems there are many ideas on how to improve the terminal experience. It's important to have this discussion, still I think the platform @egmontkob was talking about might be a better place to collect ideas, discuss pros and cons of a particular idea and get something rolling. For my taste this thread is in a brainstorming state that is unlikely to get us anywhere.

@PerBothner
Contributor

Off-topic, but related: I have a DomTerm feature request PerBothner/DomTerm#61 for hover text. I just wrote a revised design, based on some input in the current discussion. If you have any suggestions about the design, please leave comments there. (Unless someone has a better place to discuss escape sequences and extensions.)

@felixfbecker

I would love to be able to wrap stack traces from errors in a collapsible range so that by default only the error message is shown (and therefore not pushed out of the view by the long stack trace) and the stack trace can then be uncollapsed. This would be amazing to have in the VS Code terminal.

@Tyriar
Member

Tyriar commented Oct 7, 2019

Closing in favor of discussions in terminal-wg; we wouldn't want to pursue something like this until it is relatively standard, imo.

@Tyriar Tyriar closed this as completed Oct 7, 2019
@ssbarnea

ssbarnea commented Oct 1, 2020

@TylerJewell Can you please include a link to the terminal-wg thread you mentioned? I was unable to identify its location.

@Tyriar
Member

Tyriar commented Nov 4, 2020
