Support rendering of tab characters in terminal #734

Closed
Tyriar opened this issue Jun 27, 2017 · 43 comments
Labels
help wanted type/enhancement Features or improvements to existing features

Comments

@Tyriar
Member

Tyriar commented Jun 27, 2017

Downstream: microsoft/vscode#29523

node -e "console.log('a\tb')" should output an actual tab. This means it's considered a single character by the selection logic and that it's copied as a \t char not several spaces. This probably won't be too difficult to deal with by leveraging and improving upon the double width char code.

This is supported by gnome-terminal, Terminal.app and iTerm2.

@egmontkob

It's pretty hackish in gnome-terminal (and reportedly in iterm2 too).

The thing is, TAB is a control character that's supposed to move the cursor, but preserve the characters (including their attributes such as background color) it jumps through. So you can only do this smart copy-pasteable TAB thingy if it's printed after the current end of the line (which is good enough in practice for most cases).

You might find this (and stuff linked from there) useful: https://gitlab.com/gnachman/iterm2/issues/6193.

@Tyriar Tyriar added type/enhancement Features or improvements to existing features and removed type/feature labels Apr 4, 2018
@Tyriar
Member Author

Tyriar commented Jun 3, 2018

Closing as designed as per above comment. I don't really want to do the copy \t chars anymore as it's a bit of a hack anyway.

@Tyriar Tyriar closed this as completed Jun 3, 2018
@hansvonn

What is the resolution on this? There was a downstream issue related to this that was also closed. I am interested in this functionality as I imagine others are as well.

@Tyriar
Member Author

Tyriar commented Jan 26, 2019

@hansvonn it didn't really seem like it made sense to do given how \t works behind the scenes.

@hansvonn

@Tyriar It depends on what perspective you come from. At least for me I expect tabs to be copied when that was what I pushed to the console. I often do this when I want to generate tab delimited data using powershell. As it is now I use regex to replace spaces with tabs which is just another step.

I can understand where you are coming from in how consoles used to function. I can also understand how many other things you all have to work on. I appreciate all the good work!

@daiyam
Contributor

daiyam commented Apr 5, 2019

+1

@daiyam
Contributor

daiyam commented Aug 21, 2021

I'm finding this issue annoying when printing (cat) a file containing tab characters. I will try to make a PR to fix that issue.

@Tyriar
Member Author

Tyriar commented Aug 23, 2021

@daiyam I don't think we've found a nice way to support this, so we probably wouldn't accept a PR.

@lxomb

lxomb commented May 31, 2022

This is extremely inconvenient. I frequently need to copy data with tabs from the VSCode terminal, which is the only way I can get such data due to some limitations. Such replacement is unacceptably destroying productivity.

@Tyriar
Member Author

Tyriar commented May 31, 2022

@limbxo fyi you can pretty easily map spaces back to tabs by selecting 4 spaces for example, hitting cmd/ctrl+shift+l to select all instances of the selection in the file, delete then tab.

@daiyam
Contributor

daiyam commented May 31, 2022

Such replacement is unacceptably destroying productivity.

@Tyriar I think you missed that statement.

Also, why can other terminals do it but not xterm.js?

I never went into the details of why my PR #3434 was denied. But to this day, it's still working.

@Tyriar
Member Author

Tyriar commented Jun 1, 2022

@daiyam I saw it, my point is it can take seconds to get the tabs back.

The PR was closed because it had too many changes, you solved most of that by putting out other PRs but it still has changes for 30 files. I think a more targeted fix that only changed how copy works would be safer and have less risk for regressions elsewhere. Looking at this a little I think a safer approach would be to not annotate the "tab filler" characters but instead add a tab hint flag for any cell containing a tab. SelectionService.selectionText could then use a new useTabs parameter in Buffer.translateBufferLineToString and BufferLine.translateToString.
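
The "tab hint flag" idea above could look roughly like the sketch below. It is only an illustration under stated assumptions: the Cell type and the writeRow and translateToString functions are hypothetical stand-ins, not the real xterm.js buffer classes, and tabstops are assumed to sit at every 8th column. Only the cell where the TAB arrived is flagged; the filler cells are left unannotated.

```typescript
// Hypothetical cell model for illustration; not the actual xterm.js buffer types.
interface Cell {
  char: string;     // character stored in the cell
  tabHint: boolean; // true on the cell where a TAB was received
}

// Next evenly spaced tabstop strictly to the right of `col`.
function tabstopAfter(col: number, tabWidth: number): number {
  return (Math.floor(col / tabWidth) + 1) * tabWidth;
}

// Append-only writer: TAB pads with spaces up to the next tabstop
// and flags only the first padded cell.
function writeRow(input: string, tabWidth = 8): Cell[] {
  const row: Cell[] = [];
  for (const ch of input.split('')) {
    if (ch === '\t') {
      const next = tabstopAfter(row.length, tabWidth);
      let first = true;
      while (row.length < next) {
        row.push({ char: ' ', tabHint: first });
        first = false;
      }
    } else {
      row.push({ char: ch, tabHint: false });
    }
  }
  return row;
}

// Serialize a row; with useTabs, a flagged cell becomes '\t' only if
// every cell up to the next tabstop is still a plain space.
function translateToString(row: Cell[], useTabs: boolean, tabWidth = 8): string {
  let out = '';
  let i = 0;
  while (i < row.length) {
    if (useTabs && row[i].tabHint) {
      const next = Math.min(tabstopAfter(i, tabWidth), row.length);
      if (row.slice(i, next).every(c => c.char === ' ')) {
        out += '\t';
        i = next;
        continue;
      }
    }
    out += row[i].char;
    i++;
  }
  return out;
}

const demoRow = writeRow('a\tb');
console.log(JSON.stringify(translateToString(demoRow, false))); // "a       b"
console.log(JSON.stringify(translateToString(demoRow, true)));  // "a\tb"
```

The sketch deliberately ignores overwrites, CR, and cursor movement, which are the cases discussed later in the thread as the hard part.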

@lxomb

lxomb commented Jun 1, 2022 via email

@Tyriar
Member Author

Tyriar commented Jun 1, 2022

@limbxo since you're using VS Code you could pipe whatever you're working with into an editor: cat README.md | code -

@brandonros

Such replacement is unacceptably destroying productivity.

@Tyriar I think you missed that statement.

Also, why can other terminals do it but not xterm.js?

I never went into the details of why my PR #3434 was denied. But to this day, it's still working.

this got moved to its own repo fyi: https://github.com/daiyam/xterm-tab

given this comment:

The PR was closed because it had too many changes, you solved most of that by putting out other PRs but it still has changes for 30 files.

can we retry now a year later to resolve this? console.log(\t) to then copy the output to the clipboard and getting spaces in VS Code terminal but not the default Mac terminal is confusing, and no i don't think you can just replace 4 spaces with a tab on the output given the alignment problems from what i understand

https://github.com/xtermjs/xterm.js/compare/master...brandonros:xterm.js:tab?expand=1 i made this to see how far the two projects have drifted

@davidstone

Closing as designed as per above comment. I don't really want to do the copy \t chars anymore as it's a bit of a hack anyway.

As a user, how it works behind the scenes isn't really relevant to me. If the feature makes sense (and I think "copy the characters my program output instead of other characters" makes sense), then if the implementation doesn't match up with the feature request the implementation is wrong, not the feature. It could be reasonable to close this as "Too hard to solve" or "Not worth the effort", but "Working as intended" feels like the wrong answer here.

@daiyam
Contributor

daiyam commented Oct 29, 2023

I think "copy the characters my program output instead of other characters" makes sense

Yeah! A feature-blocker for me.

My fork is still working, but for around 6 months it has been breaking some of my powerlines...

@jerch
Member

jerch commented Oct 29, 2023

and I think "copy the characters my program output instead of other characters" makes sense

TAB is not a character, it is a control code meant to be "executed" by the terminal. That's what xterm.js does "as intended".

... then if the implementation doesn't match up with the feature request the implementation is wrong, not the feature.

Let me translate that - if someone requests that a red traffic light should mean "go" instead of "stop" - the traffic lights are all wrong? Interesting logic, good luck with that.

@starball5

If the semantics argument is that a tab is not a character but a control code, then should it not be that instead of outputting spaces, it creates an unselectable spacing element? I'm genuinely confused about the logic here. And why doesn't the same logic apply to line feeds?

Also, related on Stack Overflow: Printing tab (\t) characters in VS Code terminal.

@daiyam
Contributor

daiyam commented Jan 31, 2024

In VSCode, I just wonder why the tab character behaves differently in the editor and in the terminal.

  • editor: with "editor.insertSpaces": false, no spaces are inserted in place of the tab character.
  • terminal: spaces are always inserted instead of a tab character.

fyi you can pretty easily map spaces back to tabs by selecting 4 spaces for example, hitting cmd/ctrl+shift+l to select all instances of the selection in the file, delete then tab.

No because we don't know if a 4 spaces block was a tab or 4 spaces in the original file.

@jerch
Member

jerch commented Jan 31, 2024

If the semantics argument is that a tab is not a character but a control code, then should it not be that instead of outputting spaces, it creates an unselectable spacing element?

The behavior is pretty standardized across terminal emulators and the question how to represent empty space by SP - it dates back to old DEC behavior predating any of the fancy styling/annotation possibilities with HTML/CSS or even unicode. We cannot simply use other patterns here ignoring what 99% of all other TEs would do.

I'm genuinely confused about the logic here. And why doesn't the same logic apply to line feeds?

The reason why TABs are hard to impossible to replay from the terminal state is the kind of behavior TAB controls, not simply the fact that it is a control char. A newline (NL), for example, is also a control char (more precisely 2 control chars, CR + LF), yet it can be replayed because it has a direct representation in the terminal buffer (well, LF has; CR is still lost).
That's not the case for purely cursor-manipulating controls like CR, TAB or any of the cursor repositioning sequences. Those just move the "writing head" around on a 2D text grid. Preserving the fact that the writing head was moved would need a full control sequence recorder, which is typically not what anyone wants from copy & paste, as it needs a control-sequence-aware target on the paste side.

For TAB, that "moving the writing head" creates edge cases which make a reasonable restore of TABs impractical; most notably, any jumped-over content cannot be serialized in a useful way anymore. Compared with what most editors do here, a terminal has another drawback: editors mostly run in insert mode, moving existing content to the right, while terminals typically run in replace mode. Example:

given: a terminal row contains these characters, cursor at position with 'x'
['h', 'e', 'y', ' ', 't', 'h', 'e', 'r', 'e', '!', 'x']
input: '\r\tBob'
--> CR - cursor jumps to first cell (cannot be serialized)
--> TAB - cursor jumps to the next tabstop at cell index 4, counting from 0 (given tabwidth is 4)
--> PRINT('Bob') in replace mode
['h', 'e', 'y', ' ', 'B', 'o', 'b', 'r', 'e', '!']
now cursor is at position with 'r'

So the full input was 'hey there!\r\tBob'. How should that be serialized back and still contain the TAB char? Note that the final screen state now reads as 'hey Bobre!'. It's impossible, unless the target system fully supports the cursor mechanics and is in replace mode.
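
The walkthrough above can be replayed with a small simulation. This is a minimal sketch, not xterm.js code: writeToRow is a hypothetical helper that handles only CR, TAB and printable characters, assumes evenly spaced tabstops with a tabwidth of 4 as in the example, and ignores the row width limit.

```typescript
// Minimal replace-mode row: CR, TAB and printable characters only.
function writeToRow(input: string, tabWidth = 4): { cells: string[]; cursor: number } {
  const cells: string[] = [];
  let cursor = 0;
  for (const ch of input.split('')) {
    if (ch === '\r') {
      // CR: move the cursor to the first cell, nothing is written.
      cursor = 0;
    } else if (ch === '\t') {
      // TAB: move the cursor to the next tabstop, nothing is written.
      cursor = (Math.floor(cursor / tabWidth) + 1) * tabWidth;
    } else {
      // Pad any never-written cells the cursor jumped over.
      while (cells.length < cursor) cells.push(' ');
      // Replace mode: overwrite whatever is in the cell.
      cells[cursor] = ch;
      cursor++;
    }
  }
  return { cells, cursor };
}

const replayed = writeToRow('hey there!\r\tBob');
console.log(replayed.cells.join(''));       // hey Bobre!
console.log(replayed.cells[replayed.cursor]); // r

// A completely different input produces the same row content,
// so the row alone cannot tell which one was sent.
console.log(writeToRow('hey Bobre!').cells.join('')); // hey Bobre!
```

Because two different byte streams end in the same grid, any serializer working from the grid has to guess whether a TAB was involved.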

@daiyam
Contributor

daiyam commented Jan 31, 2024

The behavior is pretty standardized across terminal emulators and the question how to represent empty space - it dates back to old DEC behavior predating any of the fancy styling/annotation possibilities with HTML/CSS or even unicode. We cannot simply use other patterns here ignoring what 99% of all other TEs would do.

I have tested a bunch of terminals and many do support the tab character (like Terminal.app on macOS) at the start of a line.

I do agree that those characters are tricky and that commands like printf 'hey there!\r\tBob' are problematic.
Here is what I currently get in VSCode (standard xterm.js):
[Screenshot: Screen Shot 2024-01-31 at 14 58 17]

What I would like is an option to avoid replacing the tab character, so that when copied it's still a tab character.

@Tyriar
Member Author

Tyriar commented Jan 31, 2024

We could indeed hack something together to guess if it's a tab, but personally I've been burned before with these designed to be flaky features such as alt+click to move the cursor which has resulted in many non-actionable issues that I need to triage/respond to/close, and will continue to until the feature is removed as I don't think it's possible to fix its flaws. I can foresee years of "tab character copied when copying in the terminal!!!" issues if we implemented this.

The safest option is to go with the well defined and consistent behavior of copying exactly what the terminal is told to render.

@daiyam
Contributor

daiyam commented Jan 31, 2024

The safest option is to go with the well defined and consistent behavior of copying exactly what the terminal is told to render.

I agree! I would be happy to help.

@jerch
Member

jerch commented Jan 31, 2024

@daiyam About your 'hey Bob%' in the shell: that happens because the cursor is still at the 'r' of 'hey Bobre!'. To see what's actually in that row, just append another LF, like this: 'hey there!\r\tBob\n', to avoid overwriting by the shell prompt.

The safest option is to go with the well defined and consistent behavior of copying exactly what the terminal is told to render.

I agree! I would be happy to help.

Maybe I misunderstood - returning the actual render state is exactly what xterm.js does atm.

@daiyam
Contributor

daiyam commented Jan 31, 2024

Maybe I misunderstood - returning the actual render state is exactly what xterm.js does atm.

Or maybe me!

For example, when rendering printf '\tBob', the \t is moving the cursor by 4. It isn't told to insert 4 spaces.
So I would say that xterm.js doesn't render what it's being told...

Edit: it's the title of the issue...

@jerch
Member

jerch commented Jan 31, 2024

For example, when rendering printf '\tBob', the \t is moving the cursor by 4.

That's not correct. TAB tells the terminal to move the cursor to the next tabstop, which might be anywhere from one cell to a full tabwidth away in printing direction (typically to the right).
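
To make the "one to max tabwidth cells" point concrete, here is a small sketch. It assumes evenly spaced tabstops every 8 columns and ignores custom tabstops; nextTabstop is a hypothetical helper name, not an xterm.js function.

```typescript
// Column of the next tabstop strictly to the right of `col`,
// assuming tabstops at every multiple of `tabWidth`.
function nextTabstop(col: number, tabWidth = 8): number {
  return (Math.floor(col / tabWidth) + 1) * tabWidth;
}

// How far the cursor travels depends on where it starts.
for (const col of [0, 1, 7, 8]) {
  console.log(`col ${col}: moves ${nextTabstop(col) - col} cells`);
}
// col 0: moves 8 cells
// col 1: moves 7 cells
// col 7: moves 1 cells
// col 8: moves 8 cells
```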

It isn't told to insert 4 spaces.
So I would say that xterm.js doesn't render what it's being told...

Huh? xterm.js does not insert 4 spaces, it just moves the cursor to the next tabstop. Imho that rendering behavior is correct. You are mixing terminal render state with copy&paste row serialization here. To stick with the title of the ticket - rendering of a TAB makes no sense, as it is a cursor action with no visual representation. It can only be observed by actual content around it.
But we are now going in circles; there is not much to add.

@daiyam
Contributor

daiyam commented Jan 31, 2024

xterm.js does not insert 4 spaces

So why do I get spaces when copied?

I don't remember the internals of xterm.js but, all I know is that \t are replaced by some space characters.

@Tyriar
Member Author

Tyriar commented Jan 31, 2024

@daiyam because there's a space there, as opposed to an empty cell where we would not.

@jerch
Member

jerch commented Jan 31, 2024

So why do I get spaces when copied?

Because that's the only reliable way to advance the cursor in any editor/textfield to preserve the visual state of the terminal output. (Edit: and nope, that's not done in terminal rendering, but during content serialization of the terminal buffer.)
Btw, outputting a TAB does not help much here, even for the easy case of empty jumped-over terminal cells. The width of a TAB is not defined; it depends on the output system. An editor or terminal might render it at a different width (typically 2, 4 or 8 SP wide). Only a direct translation to SP can preserve the terminal screen here.
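
The width dependence mentioned above can be seen by expanding the same string at the three typical tab widths. This is an illustrative sketch only: expandTabs is a hypothetical helper that handles a single line and assumes evenly spaced tabstops.

```typescript
// Expand TABs in a single line to spaces for a given tab width.
function expandTabs(text: string, tabWidth: number): string {
  let out = '';
  for (const ch of text.split('')) {
    if (ch === '\t') {
      // Pad up to the next multiple of tabWidth (always at least one space).
      const pad = tabWidth - (out.length % tabWidth);
      for (let i = 0; i < pad; i++) out += ' ';
    } else {
      out += ch;
    }
  }
  return out;
}

for (const width of [2, 4, 8]) {
  console.log(width, JSON.stringify(expandTabs('a\tb', width)));
}
// 2 "a b"
// 4 "a   b"
// 8 "a       b"
```

The pasted TAB therefore reproduces the terminal's layout only if the paste target happens to use the same tab width.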

@starball5

@jerch Thanks for the example and elaboration. It was helpful to me to understand the reasoning here.

@Eric-Polin

Tab is not a character ??? And zero is not a number, I suppose ???
You know, there are people working, here. We are not interested in philosophy, we just want an option to get rid of that absurd interference of VSCode with our output.

@jerch
Member

jerch commented Jul 1, 2024

@Eric-Polin This is not a philosophical dispute, but a matter of practicability. TAB cannot be replayed in a reliable fashion unless all paste targets fully implement cursor mechanics with replace mode. You will figure that out yourself by re-reading the comments above and putting some effort into comprehension.
If you rely on terminal scraping to get a meaningful response from your app - "you are holding it wrong". The POSIX terminal/pipe layer has known a distinction between TTY and normal IO pipes for exactly that reason for more than 50 years. Use IO pipes instead.

@Eric-Polin

Thank you for your diligent response.
POSIX dates back to 1988, actually, and I had been developing for 14 years already at the time.
Although standards are no excuse against common sense, there is some coherence indeed : cmd and Windows Terminal behave the same way (of course, to the contrary, xfce4-terminal and the like plainly render tabs as tabs). So, the current functionality that simulates that odd Windows behaviour within VSCode may be of interest when developing a command-line tool for Windows; not a very common case, obviously.
Now, for all other development and testing uses, what is expected is to be able to check the printed data, without being ensnared by some uninvited presentation layer that interferes with it. So, this behaviour should be available as well, to say the least.

@jerch
Member

jerch commented Jul 1, 2024

POSIX dates back to 1988, actually, and I had been developing for 14 years already at the time.

If that's the case, then you might also know that the TTY/pipe distinction in unixoid OSes is much older, from around the early 70s. It does not matter at that point when POSIX finally specced it into a reliable standard.

Windows is a totally different story. They had better not have dumped their old SFU extension (worked up to XP, I think), which gave a much better POSIX-compliant abstraction layer to Windows than the patchwork we have today. Even OpenVMS (which is much closer to NT than any unixoid system) still gets that done.

Now, for all other development and testing uses, what is expected is to be able to check the printed data, without being ensnared by some uninvited presentation layer that interferes with it. So, this behaviour should be available as well, to say the least.

Then work with raw terminal data. It has all the control codes incl. TAB. But again, you are going to need a parsing lib that ideally implements terminal mechanics, or at least proper cursor sequences (e.g. expect implements a vt100 clone, imho...).

@daiyam
Contributor

daiyam commented Jul 1, 2024

Huh? xterm.js does not insert 4 spaces, it just moves the cursor to the next tabstop.

So why do I get spaces when copied?

Because that's the only reliable way to advance the cursor in any editor/textfield to preserve the visual state of the terminal output. (Edit: and nope, that's not done in terminal rendering, but during content serialization of the terminal buffer.)

Those are previous answers I got. In one, tabs aren't replaced by spaces. In the second one, they are replaced by spaces. It looks like there is an issue there.

What I would like is that when a tab is printed, I should be able to get it back when copying the output (only when the tab isn't moving over existing characters).

@jerch
Member

jerch commented Jul 1, 2024

Those are previous answers I got. In one, tabs aren't replaced by spaces. In the second one, they are replaced by spaces. It looks like there is an issue there.

As always - context matters. You successfully stripped the context in a way that makes this look like a contradiction. Try again with the context included and you will see - there is no contradiction.

@daiyam
Contributor

daiyam commented Jul 1, 2024

Can you at least agree that with printf 'tab=\t;\n', we can't get back the tab when copied? And that is an issue?

@jerch
Member

jerch commented Jul 2, 2024

Can you at least agree that with printf 'tab=\t;\n', we can't get back the tab when copied?

Sure I can agree, because xterm.js won't give you the TAB when you copy the terminal output.

And that is an issue?

Nope. If you want structured data from your app, don't scrape the terminal. The terminal output is a visual representation meant to make sense to the human eye. Scraping the terminal is like getting text from a photocopy: it would need heuristics to translate that eye-pleasing representation back into a structured data variant (like OCR for text from a photocopy). To keep working with TABs included and see the terminal output at the same time, you can do things like:

$> printf 'tab=\t;\n' | tee >(python -c 'import sys; print(repr(sys.stdin.read()))')
tab=    ;
'tab=\t;\n'

where the first line is the "content report" by xterm.js, and the second line the content seen by your processor still containing TABs (python here).

Last but not least - most TEs (except vte-based and iterm2) work exactly as xterm.js in this regard. I don't think we should change that to a wonky "might-output-TABs-under-circumstanceA-but-not-under-B,C" state. It will only cause more confusion and follow-up issues with people complaining about why they don't get TABs for B & C, or even why they get TABs for A at all. Watering down the terminal state further with wonkiness was never a good idea in the past, and it most likely will not be in the future. And since there are processing alternatives, I don't even see a reason to keep this discussion going.

@daiyam
Contributor

daiyam commented Jul 2, 2024

Last but not least - most TEs (except vte-based and iterm2) work exactly as xterm.js in this regard.

Except:

  • Terminal.app
  • Kitty
  • Alacritty

Most Linux TEs are based on VTE...

@pauldraper

It's too hard to do this.

Nevermind that every other terminal in the world preserves tabs.

It's just too hard/not a good idea.

@jerch
Member

jerch commented Nov 19, 2024

Since we just entered the kindergarten level in this thread, I am going to lock it.

If anyone has something substantial to add to the topic, like an implementation idea addressing the edge cases, feel free to open a discussion about it and maybe link to this topic. Please try to refrain from adding derailing or dumb comments, as it helps no one to get anywhere. Thx.

@xtermjs xtermjs locked as spam and limited conversation to collaborators Nov 19, 2024
@Tyriar
Member Author

Tyriar commented Nov 19, 2024

@jerch I was very close to doing the same 👍
