This issue was moved to a discussion.

Extra space in terminal when using surrogate pair with oh-my-posh #5174

Closed
sharky98 opened this issue Oct 2, 2024 · 9 comments

sharky98 commented Oct 2, 2024

When I use a surrogate pair (such as \udb80\udced for the nf-md-calendar icon), an unstyled extra space appears after the segment content and before the trailing diamond.

I opened an issue for both terminals where this happens, but since both appear to use xterm.js, I am opening an upstream issue here.

See JanDeDobbeleer/oh-my-posh#5706 for original report.

Details

  • Browser and browser version: I don't know how it is used in Tabby and VSCode (my guess would be Electron?)
  • OS version: Windows 11
  • xterm.js version: I don't know which version is used in Tabby (1.0.215) and VSCode (1.93.1).

Steps to reproduce

With oh-my-posh, use a surrogate pair in any segment such as below.

{
  "blocks": [
    {
      "segments": [
        {
          "background": "#234d70",
          "foreground": "#d6deeb",
          "leading_diamond": "<transparent,background>\ue0b0</>",
          "properties": {
            "time_format": "2006-01-02"
          },
          "style": "diamond",
          "template": " \udb80\udced {{ .CurrentDate | date .Format }} ",
          "trailing_diamond": "<background,transparent>\ue0b0</>",
          "type": "time"
        }
      ]
    }
  ]
}

jerch (Member) commented Oct 2, 2024

@sharky98 Those are code points from the Private Use Area (PUA) regions of Unicode. Unicode leaves them unspecified on purpose, since they are meant to be application-dependent. Terminal emulators therefore don't know how wide such a character should be rendered, and often just guess between 1 and 2 cells. Ideally we'd have an API to declare properties of PUA characters, including their widths, so apps could load their definitions on the fly. But no terminal emulator currently offers such an interface. That's the theory part.
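
To make the PUA point concrete, here is a minimal Python sketch that checks whether a code point falls in one of the three Private Use ranges defined by Unicode. The helper name `is_private_use` is made up for illustration; it is not part of xterm.js or oh-my-posh.

```python
# Private Use Area ranges defined by Unicode: one in the BMP, two supplementary planes.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area (BMP)
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(code_point: int) -> bool:
    """Return True if code_point lies in one of the Unicode PUA ranges.

    Hypothetical helper for illustration only.
    """
    return any(lo <= code_point <= hi for lo, hi in PUA_RANGES)


print(is_private_use(0xE0B0))    # separator glyph used in the config above
print(is_private_use(0xF00ED))   # nf-md-calendar (\udb80\udced as a UTF-16 pair)
print(is_private_use(ord("#")))  # plain ASCII, for comparison
```

A terminal can tell that such a character is private-use, but nothing in Unicode tells it how many cells the glyph should occupy.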

Now to your char at hand:

Your UTF-16 surrogate pair \udb80\udced translates to \U000F00ED in UTF-32, which gets a width of 1 cell in all of our renderers and all of our Unicode addons:
image
(I don't have any powerline font installed, hence the placeholder question mark symbol...)
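
The translation from the surrogate pair to the UTF-32 value can be reproduced with the standard UTF-16 decoding formula. The following Python sketch uses a hypothetical helper name, `decode_surrogate_pair`, purely for illustration.

```python
def decode_surrogate_pair(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point.

    Hypothetical helper for illustration only.
    """
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 bits of the offset above U+10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


code_point = decode_surrogate_pair(0xDB80, 0xDCED)
print(f"U+{code_point:X}")  # U+F00ED
```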

That's odd, because your issue suggests that xterm.js renders it 2 cells wide (thus creating an empty cell behind it).
Can you create a screenshot of echo -e '#\U000F00ED#' and upload it here?

sharky98 (Author) commented Oct 4, 2024

@jerch Thanks for the quick answer. I just tried it in the VSCode terminal. It seems to render it 1 cell wide (although the # symbol seems to overlap a bit, but I attribute that to the specific font I use, FiraMono Nerd Font; switching to the "Mono" variant removes that overlap, but not the extra space).

I added a second segment with another symbol that uses a single code unit, \uf455, to see the difference.

result of echo -e '#\U000F00ED#'

VSCode
image

Windows Terminal
image

Other fonts
I also tried other Nerd Font families; all have the extra space. In fact, the extra space is there even with Consolas.

image

The weird thing is that the extra space is not right after the symbol, but at the end of the string where it is used. To test this I used #, ~ and % at the start and end of each string (alternating between consecutive strings). I also added the surrogate pair in a leading diamond to verify that the behaviour is consistent when using oh-my-posh.

Demonstrate extra space after end of string
The space between symbol and date is expected from my template.
image

And to be thorough, I removed oh-my-posh and modified PS1 in my .bashrc to the following. The result is that using \U000F00ED creates an extra space after the string (the text typed after the > is preceded by a space), but using \Uf455 does not.

PS1=$'~Before~#\U000F00ED#~After~'
PS1+=$'%Another%>'

image

PS1=$'~Before~#\Uf455#~After~'
PS1+=$'%Another%>'

image

When using Windows Terminal, there is no extra space after the > in either case.
image

sharky98 (Author) commented Oct 4, 2024

For completeness, I also tried a character that is not from the PUA regions but still needs a surrogate pair. I used \U00010631 (pair \ud801\ude31), and the extra space is there as well.

image
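
The pairs quoted in this thread can be double-checked by running the conversion in the other direction, from code point to surrogates. This is a minimal Python sketch; `encode_surrogate_pair` is a hypothetical name used only here.

```python
def encode_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a supplementary-plane code point into a UTF-16 (high, low) pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)   # upper 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)  # lower 10 bits of the offset
    return high, low


for cp in (0x10631, 0xF00ED):
    high, low = encode_surrogate_pair(cp)
    print(f"U+{cp:X} -> \\u{high:04x}\\u{low:04x}")
```

Both characters need a pair, but only U+F00ED is private-use, which suggests the extra space is tied to surrogate pairs in general rather than to PUA width guessing.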

jerch (Member) commented Oct 5, 2024

@sharky98 The output of your PS1 examples looks really broken, and I am pretty sure that it is not xterm.js' fault (I cannot reproduce it here). As you correctly noted, if xterm.js applied the wrong cell width to the PUA char, the empty cell would be right behind that char.
This looks more as if you are getting a broken string from the PTY. Is this with ConPTY by any chance? ConPTY used to have weird Unicode issues in the past (Windows Terminal ships its own, newer ConPTY version).

Can you somehow get the byte sequence sent from the PTY master to the terminal? (I don't know if that's possible in VSCode, or whether it allows enabling terminal debug logging...)

sharky98 (Author) commented Oct 5, 2024

Thanks again for the quick feedback. This is getting a bit too advanced for my terminal knowledge 😅. I'll see what I can do.

Also, to help me figure it out, when you say my PS1 is broken, what do you mean? Because other than overwriting the PS1 variable to test a surrogate pair, this is all default Ubuntu 24.04 bash config 🤔.

Do you know of an app that uses the most stripped-down xterm.js, so I can test this outside VSCode? It also happens in Tabby, but only for pwsh, not in SSH sessions, and I have no idea what differs between the two session types.

jerch (Member) commented Oct 5, 2024

The most stripped-down version of xterm.js is probably our demo. You can get it running with these steps:

  • clone this repo
  • cd xterm.js
  • yarn
  • yarn setup
  • yarn esbuild
  • yarn esbuild-demo
  • yarn start
  • point your browser to localhost:3000

On the demo page you get tons of terminal settings to play with. The important part is to set logLevel to debug, which logs the bytes sent to the terminal in the browser console.

Also, to help me figure it out, when you say my PS1 is broken, what do you mean?

No, your PS1 strings look fine; it is the resulting output in the screenshot that looks all wrong to me. For comparison, that's what I get:
image

If you manage to get the demo running, I'd need the bytes sent to the terminal for that particular prompt, which looks like this for me:
image

sharky98 (Author) commented Oct 5, 2024

Regarding my PS1 output, it looks to me that we have the same. I added the text after the first > to really show the space.

Now, I was able to launch the demo (after struggling with yarn classic 😅). And... the extra space does not appear when using main. The mystery deepens! I'll try to check the VSCode source to see if I can match their settings and report back.

Demo
image
And the bytes just after pressing "enter".

xterm.js: parsing data "

~Before~#󰃭#~After~%Another%>" 
Array(31) [ 13, 10, 126, 66, 101, 102, 111, 114, 101, 126, … ]
LogService.ts:65:9
[13, 10, 126, 66, 101, 102, 111, 114, 101, 126, 35, 56192, 56557, 35, 126, 65, 102, 116, 101, 114, 126, 37, 65, 110, 111, 116, 104, 101, 114, 37, 62]
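
The logged numbers are UTF-16 code units, so 56192 and 56557 are the two halves of the surrogate pair (0xDB80 and 0xDCED). As a sketch, the array from the demo log can be turned back into text in Python like this; the function name `utf16_units_to_str` is an assumption made for illustration.

```python
# Code units copied from the demo log above.
LOGGED_UNITS = [
    13, 10, 126, 66, 101, 102, 111, 114, 101, 126, 35, 56192, 56557, 35,
    126, 65, 102, 116, 101, 114, 126, 37, 65, 110, 111, 116, 104, 101,
    114, 37, 62,
]


def utf16_units_to_str(units: list[int]) -> str:
    """Decode a list of UTF-16 code units into a Python string.

    Hypothetical helper for illustration only.
    """
    raw = b"".join(unit.to_bytes(2, "little") for unit in units)
    return raw.decode("utf-16-le")


text = utf16_units_to_str(LOGGED_UNITS)
print(ascii(text))  # non-ASCII characters are shown as escapes
print(len(text))    # 30: the surrogate pair collapses into one character
```

The decoded string ends with ">", so the demo receives nothing after the prompt itself.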

VSCode
image

I was able to turn on debug logging for the VSCode terminal without having to dig too deep. It seems that VSCode appends the following bytes to every prompt: 27,91,49,67.

2024-10-05 14:36:25.709 [debug] [fe9df4a] parsing data "
~Before~#󰃭#~After~%Another%>\x1b[1C" [[13,10,126,66,101,102,111,114,101,126,35,56192,56557,35,126,65,102,116,101,114,126,37,65,110,111,116,104,101,114,37,62,27,91,49,67]]

Windows Terminal
image

jerch (Member) commented Oct 5, 2024

Yup, the demo looks good: ">" (62) is the last byte sent (same for me here). VSCode has that extra space, wherever it comes from...

Edit: Oh well, that space comes from CSI 1 C, which means "move the cursor one cell forward". That's the extra space. But what places that sequence at the end? What happens if you remove the PUA char from PS1? Is that sequence still added?
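
The trailing 27,91,49,67 in the VSCode log spells ESC [ 1 C. As a rough sketch, the following Python snippet scans a list of logged code units for cursor-forward (CUF) sequences. The regex and the helper name `find_cursor_forward` are illustrative assumptions and have nothing to do with how xterm.js actually parses input.

```python
import re

# CSI Ps C = cursor forward (CUF); an omitted parameter defaults to 1.
CUF_PATTERN = re.compile(r"\x1b\[(\d*)C")

# Code units copied from the VSCode debug log in this thread.
VSCODE_UNITS = [
    13, 10, 126, 66, 101, 102, 111, 114, 101, 126, 35, 56192, 56557, 35,
    126, 65, 102, 116, 101, 114, 126, 37, 65, 110, 111, 116, 104, 101,
    114, 37, 62, 27, 91, 49, 67,
]


def find_cursor_forward(units: list[int]) -> list[int]:
    """Return the cell count of every CUF sequence found in the code units.

    Hypothetical helper for illustration only.
    """
    # chr() per unit is enough here: the escape sequence is pure ASCII.
    text = "".join(chr(unit) for unit in units)
    return [int(match.group(1) or "1") for match in CUF_PATTERN.finditer(text)]


print(find_cursor_forward(VSCODE_UNITS))       # [1]
print(find_cursor_forward(VSCODE_UNITS[:-4]))  # [] (same data without the tail)
```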

jerch (Member) commented Oct 5, 2024

Moving this thread to a discussion; it is clear by now that this is not an xterm.js issue.

xtermjs locked and limited conversation to collaborators Oct 5, 2024
jerch converted this issue into discussion #5185 Oct 5, 2024
