Remove the canvas renderer addon #4779
Oh dear - it happened. RIP dear canvas renderer...
Apologies for commenting post-facto. I had missed the fact that canvas was going away, or I would have written sooner. In my use case, I depend heavily on the canvas renderer, as I find that it provides several benefits over WebGL. Before getting too deep into it, I'd like to test the waters on whether keeping canvas alive is open to discussion. To provide context for any other readers, I'll briefly note that canvas has served as a strategic middle point between the DOM renderer and something more performant. DOM rendering lags on performance, memory usage, and rendering accuracy (compared to the optimized addons), and is generally untenable for any large quantity of text. Canvas solves this problem. The WebGL addon does too, but comes with caveats. WebGL uniquely presents some interesting challenges which I think make it questionable for production use in some cases:
For many products, tools, and myself, it's very reasonable to have more than 8-16 live terminals on an origin. Mission-critical applications are written in HTML+JS these days; that's unavoidable. This may seem like hyperbole, but the loss of a terminal mid-keystroke is like a power outage hitting your serial-connected emergency console during an incident. WebGL crapping out on you, or suddenly becoming unavailable, is unacceptable for an SRE type, or even a SWE doing a mission-critical task. Our largest tech companies (as a former Googler, I know it well) run everything in a browser, as does vscode. So, I'd love to investigate what a world would look like if we did maintain both canvas and webgl, as I believe canvas provides critical stability (and performance) against the risk of a sudden webgl outage or limits being hit. I know it's a hassle to maintain both GL and canvas. Thanks for your time and consideration. Happy to continue the discussion, regardless of outcome.
@davidfiala the DOM renderer is actually way faster now than it used to be; that's why removing the canvas renderer was being considered again to begin with. The issue with keeping it around is that I end up having to waste time maintaining it, and any new feature needs to be considered for 3 different renderers. It also had some serious design flaws, such as cells not being able to draw beyond their line, which means, for example, that underscores could become completely invisible. Additionally, I'd like to eventually port the webgl one to WebGPU, but I definitely wasn't going to attempt that only to then maintain 4 implementations. For the context loss issue there are hooks to listen for it and handle it; you shouldn't be facing data/connection loss, as it's just the renderer that needs to be switched. I'd like this to be handled as smoothly as possible, so if you have ideas to make it better I'd love to hear them. VS Code has these same GL context limitations, and I haven't heard any complaints for a long time after the initial fixes to context loss handling.
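For readers looking for those hooks, here is a minimal sketch of handling context loss, assuming the current `@xterm/` package names and the WebglAddon's `onContextLoss` event (disposing the addon on loss reverts rendering to the DOM renderer; data and connection state are untouched):

```ts
import { Terminal } from '@xterm/xterm';
import { WebglAddon } from '@xterm/addon-webgl';

const terminal = new Terminal();
terminal.open(document.getElementById('terminal')!);

function enableWebgl(): void {
  const webgl = new WebglAddon();
  // Fires when the browser reclaims the GL context (context limit hit, GPU reset, ...).
  webgl.onContextLoss(() => {
    // Disposing the addon switches rendering back to the DOM renderer; input and the
    // underlying connection are unaffected, only the renderer is swapped.
    webgl.dispose();
    // A host application could retry enableWebgl() later, e.g. when the terminal becomes visible again.
  });
  terminal.loadAddon(webgl);
}

enableWebgl();
```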
Thanks for the note, @Tyriar. I totally understand the challenge and empathize with the difficulty of multiple impls. Re the DOM renderer: I think one problem I ran into was that keeping a substantial amount of scrollback was a big showstopper. Maybe it is better now and I just need to try again. Does the DOM renderer add scroll history on demand, or leave it permanently in the DOM? (To be fair, on-demand history loading/unloading sounds hard to accomplish.) I'm assuming that vscode uses the webgl renderer by default? In vscode I loaded up a bunch of terminals, all running a constantly-changing visual app. The broken panels do come back into existence eventually. But it makes me wonder: how? If loading more than 16 panels or so causes the oldest to crash (presumably due to context loss), is it just grabbing another context? Is the one it "stole" a context slot from also crashed now? Are they just ping-ponging around between 16 slots many times a second? I'm not sure yet; I can definitely experiment more. Back to your original request for ideas, here's a brainstorming question: could xterm.js be aware of multiple instances of itself, and multiplex a single GL context to render all live instances? I.e., with 1 context, write N different terminals to N different output locations? ;) Sorry if this is an uninformed idea; I'm not very familiar with WebGL.
Hmm, I wasn't aware that browser engines restrict GL context usage to such low numbers (btw they also do this for audio contexts, which for some reason I was aware of...). To me your idea seems to be the only "workaround" in the long run. But note that this will be rather cumbersome to achieve, as it would involve a global (as in JS-context-wide) driver-like manager dealing with the scarce GL resource. xterm.js makes heavy use of texture RAM for the glyph cache, which of course will see even worse thrashing behavior when shared across several terminals at once. (I don't know how browser engines deal with that at the level of separate GL contexts, whether they have access to faster blitting primitives than WebGL offers, etc.) @Tyriar Could we learn here from other highly integrating 3D libs like three.js? Do they have a global GL manager?
@davidfiala it doesn't; it should perform the same with 0 and 100,000 lines of scrollback, as it only considers the viewport. Unless you have
This is just a case that people hit so rarely that no one reports it. I think we handle recovery if the context was lost while it was in the background, and then replace it. For this particular case we could handle it such that, if there are so many terminals visible at once that no contexts are available, that particular terminal temporarily falls back to DOM.
I'm actually working on a webgpu renderer for the monaco editor and I had this exact idea recently: share a canvas between multiple editors in order to solve the problem of notebooks, which can have an unbounded number of editor instances. The main difficulties in applying this idea to xterm.js are:
I wouldn't be able to spend time on this, but someone else could have a look if it's very important to them. Personally I think for xterm.js this is such an edge case that it's not worth it when you can just fall back. Also, I'm not sure if webgpu has these same limitations; porting to that could be the most forward-looking solution.
@jerch I haven't used it, but I had a look at their docs and I don't think they have anything special here.
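Purely as an illustration of the shared-canvas idea being kicked around above (nothing xterm.js actually implements; all names are hypothetical): one hidden WebGL canvas renders each instance's frame in turn, and the result is blitted into that instance's own cheap 2D canvas, so only one scarce GL context is ever held.

```ts
// Hypothetical sketch: one shared WebGL canvas, many visible 2D canvases.
const sharedGlCanvas = document.createElement('canvas');
const gl = sharedGlCanvas.getContext('webgl2');

function renderInto(target: HTMLCanvasElement, drawFrame: (gl: WebGL2RenderingContext) => void): void {
  if (!gl) { return; }
  // Size the shared canvas to match the instance being rendered.
  sharedGlCanvas.width = target.width;
  sharedGlCanvas.height = target.height;
  gl.viewport(0, 0, target.width, target.height);
  drawFrame(gl);
  // Copy the finished frame into the instance's visible canvas. 2D contexts are
  // cheap to acquire, so the GL context limit is never approached.
  target.getContext('2d')?.drawImage(sharedGlCanvas, 0, 0);
}
```

The texture-atlas thrashing mentioned above would be the obvious cost of this approach: every instance's glyphs would compete for the same texture memory.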
@Tyriar I know this is like a never-ending story - is it about time to re-evaluate a possible usage of OffscreenCanvas here? It seems that the feature coverage got much better recently; see the support table here: https://developer.mozilla.org/en-US/docs/Web/API/OffscreenCanvas#browser_compatibility (of course Safari doesn't support the contextlost event, haha, what a nightmare...)
@jerch that would only be useful if we were going to do rendering in a worker, which isn't worth the hassle IMO, as it would cause sync issues and add a bunch of complexity when most draws are already < 1ms of CPU time. Offloading the parsing, if anything, would be the way to go, but I've looked at that several times and it's difficult without encoding all our state into array buffers. We do use OffscreenCanvas to measure text metrics now: xterm.js/src/browser/services/CharSizeService.ts Lines 104 to 127 in 8f8c94a
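The linked excerpt isn't reproduced here; as a rough illustration of the idea (not the actual CharSizeService code), an OffscreenCanvas 2D context can produce TextMetrics without touching the DOM:

```ts
// Illustrative only: measure a cell's width/height for a given font via OffscreenCanvas.
function measureCell(fontFamily: string, fontSize: number): { width: number; height: number } {
  const canvas = new OffscreenCanvas(100, 100);
  const ctx = canvas.getContext('2d')!;
  ctx.font = `${fontSize}px ${fontFamily}`;
  const metrics = ctx.measureText('W');
  return {
    width: metrics.width,
    // The font bounding box approximates the full line height of the glyph run.
    height: metrics.fontBoundingBoxAscent + metrics.fontBoundingBoxDescent
  };
}
```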
The main reason we would use it beyond that is for the texture atlas, and I think all that would give us is making this a little more elegant: xterm.js/src/browser/renderer/shared/TextureAtlas.ts Lines 1095 to 1100 in 8f8c94a
Regarding three.js, I found: https://discourse.threejs.org/t/use-one-renderer-for-two-canvases/51872 It had some useful demo links of putting the same GL in multiple places. Side note: I did some benchmarking. On my machine, I can acquire 1000 2D (canvas) contexts in 5ms. WebGL acquisition is about 1,000x slower, taking 5500ms for 1000. That's notable if for any reason we ended up doing round-robin context stealing as a short-term workaround. At the same time, the limit for me is 16 contexts in Chrome. After that, I observe that one of these happens at random when I need 1 more:
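For anyone wanting to reproduce this, a rough sketch of the kind of micro-benchmark described above (not the exact code used):

```ts
// Time how long it takes to acquire N fresh contexts of a given kind.
function benchmarkContexts(kind: '2d' | 'webgl', n: number): number {
  const start = performance.now();
  for (let i = 0; i < n; i++) {
    const canvas = document.createElement('canvas');
    // getContext may return null for 'webgl' once the browser's context limit is hit.
    canvas.getContext(kind);
  }
  return performance.now() - start;
}

console.log(`2d:    ${benchmarkContexts('2d', 1000).toFixed(1)} ms`);
console.log(`webgl: ${benchmarkContexts('webgl', 1000).toFixed(1)} ms`);
```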
Thanks all for the thoughts and discussion. Having a reliable and performant renderer is really important to me and, I believe, many others, so continuing to look into this is very much appreciated. One question I have is: until we have a conclusion on the future of renderers in xterm.js, is it possible to undelete the canvas addon? I've tried a few drafts to bring my thoughts together, and I'm finding it hard to frame this question, so bear with me: are we in a situation where we can only pick 2 of the 3 below?
I feel that it might be hard to claim we can have all 3 today given the technology available. The risk that a terminal can crap out due to auxiliary failures or limitations imposed by the browser, OS, GPU, etc. makes that terminal unfit for SRE or mission-critical work. And what is a terminal if not specifically the lowest-level thing we have to debug today's infrastructure? A reliable, scalable terminal is critical.

But with that reality in mind, this may be a chance to step back and define what the values are moving forward... even if this was supposed to just be a thread about removing the canvas addon. As was pointed out before, canvas has its flaws, such as failing to show underlines. But on the other hand, it's damn performant, works everywhere, doesn't require a GPU, and scales infinitely without a dependency on any "hardware resource" like a GL context. It's suitable for SREs to use.

Maybe the canvas addon is not the future. But knowing what our values are will help us define what renderers to support and when to change relative to the tech available. And critically, it also allows downstream users to decide whether xterm.js is aligned with their long-term needs. Maybe WebGPU is the way to go, and it's webgl that should be on the chopping block next? Why pick on canvas? Maybe the DOM renderer should be the one we pick on, if we really want to drop some eng load? Today, I rate the renderers as follows:
I know there's competition in this space. {Insert-popular-new-language}-based newfangled native renderers that can provide 3000fps animated HDR ligatures are for some reason popular. And while they have their time and place, they aren't what I would recommend to my SREs or SWEs or my enterprise customers, especially on an HTML+JS stack.

So: do we want xterm.js to be a tool in the box for SRE/mission-critical software? Buttery-rendering users? Someone else? All of the above? I'm really hoping all of the above, through choice. I want to give my users the choice to upgrade to fancy mode when applicable, but by default push them to SRE mode*. Having my only option be to fall back to the DOM renderer scares me though: fallback introduces a snap change that has vastly different resource/UI requirements on the browser, lags or breaks a possibly critical operation being performed by end users, and may break layouts at the moment the user is most vulnerable, mid-action.

*Today, in 2024, SRE mode is the canvas renderer.
@davidfiala Have you actually benchmarked the DOM renderer of a more recent version? Here are some numbers from my side: #4605 (comment) Please note the comparison with the canvas renderer, which actually performs worse, with its 100% CPU saturation and tons of frame drops. 😺 Edit: And to put these numbers in perspective against other terminal emulators, see #4604 (comment)
It's plenty fast for the vast majority of use cases. Here's an example of the sort of times I get on my machine for a full viewport render (189x26 terminal running [timing figures not reproduced]). Compared to webgl: [timing figures not reproduced]. This is perfectly reasonable and reliable now (it wasn't before @jerch looked into it). The main thing you're missing out on with DOM is custom glyphs and render times/input latency as low as we can get them.
I don't think the canvas renderer meets 2: on some setups, such as when GPU acceleration is not enabled on the machine, it will render using the CPU and become extremely slow. VS Code had some special frame-measuring logic that I was very glad to get rid of. Here was the fallback chain:
There must always be a fallback to DOM because of the bad numbers canvas showed above; in the end the DOM renderer is rendering with the GPU anyway in the browser engine, just with a bunch of layout work happening (which we now minimize). I appreciate the passion, but I think you should re-evaluate the DOM renderer, which really is great now and in many cases better than canvas. We removed canvas because it's obsolete and has become a significant burden; it only exists to begin with because I didn't know webgl when I implemented it 7 years ago. The canvas renderer also missed out on some advanced features we recently added because they were too much of a hassle to implement (see the skipped tests).
Thanks folks. I'll do some more diligence and get back to you. I recall that previously with the DOM renderer one particularly cumbersome situation was browser resize - not that it was difficult to tell xterm.js the new dimensions, but that it just hung the browser for some time. I was already debouncing resize requests at that time, FWIW. I'll do my best to see if there are still cases that fail for me and report back.

But even so: it'll be sad to see the improvements of better renderers lost if users are forced to give up on WebGL because we cannot tolerate the risk of context loss during a critical operation. I'm still not sold that I want to risk fallbacks, at least in my case. My users depend on the terminal working consistently. Frankly, I think vscode devs do too. Far more people are using dev tools to manage production than they should be... I doubt we'll ever hear bug reports about it though. People may not even realize that xterm.js powers vscode. Part of my evaluation will include whether I can get external load to crash WebGL contexts on the host system. Meanwhile, I hope that the links in #4779 (comment) can help with a path forward to reduce WebGL context usage, at a minimum. It doesn't resolve my argument that we cannot trust WebGL, but it does at least provide 16+ contexts and maybe reduce the chance of a loss.

Anyhow, dropped frames or high CPU usage aside, I hope there's an open mind towards still considering the use of canvas as more data comes in. Canvas may come with caveats, and it may sometimes use the CPU instead of the GPU, but it works reliably. And it's the best in-between that we have. I don't need 300fps, 30fps, or even 3fps. I don't care about dropped frames and neither do my users. I care that in general it's reasonable-looking and functional. I know the upper-bound cost of using it isn't prohibitive, and it won't ever freak out the DOM tree because it never changes the DOM tree (I think?). It's worked admirably (AFAICT) as a happy medium across all spectrums. I'll keep you all posted as I work on this.

BTW, is the source tree known to be in reasonable shape at the point canvas was removed? I may need to snapshot or keep a fork with it cherry-picked back in for a while if I intend to adopt any other improvements that v6 will include.
One other note: fallback to DOM in my experience was pretty hard to code for. Getting near pixel-perfect container sizes and pixel widths of cells was tough. It seems to differ between fallback modes. Developer difficulty aside, the fallback simply remains a risk. Dropping keystrokes, never recovering, and so on. It feels like open-heart surgery, even if it works most of the time.
You'd be best sticking with the release before #5092, as that change and everything after it contained some really big changes which haven't been battle-tested yet (only in VS Code Insiders, not stable). Unfortunately we don't have a good link between commits and releases, but basically it's the first one that does not include any .mjs files.
I think canvas sharing would be a good thing to support, but I'm not sure I'll have time to work on it, though I may experiment with the idea in the monaco editor codebase.
I triage all the issues coming into the vscode repo too, so I have a pretty good idea of its stability. I'm actually a little surprised how little I've heard since its removal was announced in https://code.visualstudio.com/updates/v1_90#_-removal-of-the-canvas-renderer
Not sure what you mean here; a lot of elements are recycled and reused, but there are a lot of changes inside the
If you have suggestions to make this better, let us know. I know it's not great, as I see it's not even properly handled in VS Code yet. Pretty low priority on my end though.
They're completely separate systems; you should never lose any keystrokes, as input is routed elsewhere.
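To illustrate that separation, a sketch assuming the `@xterm/` packages (`sendToPty` is a placeholder for whatever transport the host app uses):

```ts
import { Terminal } from '@xterm/xterm';
import { WebglAddon } from '@xterm/addon-webgl';

declare function sendToPty(data: string): void; // hypothetical host transport

const terminal = new Terminal();
terminal.open(document.getElementById('terminal')!);

// Keystrokes flow through onData regardless of which renderer is active.
terminal.onData(data => sendToPty(data));

// Swapping the renderer only changes how cells are drawn; the subscription
// above and the underlying connection are untouched.
const webgl = new WebglAddon();
terminal.loadAddon(webgl);
webgl.dispose(); // back to the DOM renderer, no keystrokes lost
```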
@davidfiala If your software uses a recent version of xterm.js - could you please check first whether the DOM renderer is really that bad for you? This should be quite easy: just don't load the canvas and webgl addons. We are not getting anywhere if we keep telling you that the DOM renderer got much better in the last months, and you respond with vaguely phrased issues from a much older version, without having tested against a newer one. I don't say that it will work flawlessly now (in fact, we never had a single flawless renderer to begin with; they all have their rough edges for very different reasons), but perf got so much better that your point 2 should hardly be an issue anymore. I totally understand that this step might create uncertainty on your end. But simply voting to keep the canvas renderer alive will not help us, as the burden would be quite high. Instead I suggest re-evaluating the DOM renderer and focusing on and fixing the issues found there.
Out of curiosity, when using a worker, is that one that is accessible by multiple windows (tabs) across the same origin? It popped into my head last night that having 16 tabs or frames sharing the same context limit, with 1 context per tab, might pose a challenge if they aren't able to communicate or blit among themselves without being in the same JS engine instance.
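For reference on the question above: the browser primitive that multiple same-origin tabs can connect to is a SharedWorker. A minimal sketch (the script name is illustrative; whether this helps with the GL context budget is exactly the open question):

```ts
// Main thread (any tab on the origin) — all tabs connect to the same worker instance.
const worker = new SharedWorker('terminal-coordinator.js');
worker.port.start();
worker.port.onmessage = e => console.log('from worker:', e.data);
worker.port.postMessage({ type: 'hello' });

// terminal-coordinator.js — onconnect fires once per connecting tab.
// self.onconnect = (event) => {
//   const port = event.ports[0];
//   port.onmessage = msg => port.postMessage({ ack: msg.data });
// };
```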
@davidfiala we don't use workers right now, but if we were to, I doubt we would move rendering there, as parsing incoming data is the more expensive task. There's also the concern that a worker could increase input latency, which is one of the more important perf metrics.
Regarding web workers (somewhat off-topic): this would be much better with SABs for data transfer, but those are hard to set up correctly due to security constraints (geez, for quite some time it was not clear whether they would come back at all). And besides the byte-array serialization, they also need a proper locking strategy, so the implementation bar is quite high. All in all, workers are not that sexy for typical xterm.js load (at least not for me anymore). Geez, I wish we had an Erlang clone in browsers, or something similar with lightweight threading. I think the W3C missed a chance here with web components - while they separated a lot on the DOM/CSS layouting side, they kept JS in that single global context.
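For context on the security constraints mentioned above: SharedArrayBuffer is only exposed to cross-origin-isolated pages, which requires two response headers from the server. A minimal sketch using Node's built-in http module (any server or framework works the same way):

```ts
import * as http from 'node:http';

http.createServer((_req, res) => {
  // Both headers are required; without them self.crossOriginIsolated is false
  // and SharedArrayBuffer is unavailable to page scripts.
  res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
  res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
  res.setHeader('Content-Type', 'text/html');
  res.end('<script>console.log("isolated:", self.crossOriginIsolated)</script>');
}).listen(8080);
```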
Yeah, whenever I talk about workers I'm thinking it must be SABs or we'd lose all the benefits. And that means moving all state that the parser impacts into SABs.
Context from #4737 (comment)
We should remove the canvas renderer in the next major version to reduce maintenance overhead.