[3.x] Canvas item hierarchical culling #68738
I thought about this for a while, but I couldn't find a situation where doing this can happen transparently and always be a win. Will have to check the PR in detail.
Example timings with Jetpaca (10-20x faster)
Where "old" is legacy item culling and "new" is hierarchical culling.
Project Manager (3-5x faster)
Editor with Jetpaca loaded (4-8x faster)
Don't know why I did not see the previous posts, but this is awesome, congrats! I had something like this in mind since noticing the lack of performance, but it seemed really hard to make it work, so I am very thankful that you took a crack at it. Very promising.
This looks good to me. I can't see anything that would obviously cause issues. The only concern is that users may stumble onto a set of conditions that we haven't considered. But at this point the best way forward is to merge this and get broader coverage.
I am very enthusiastic about this approach and hopeful that we can polish it, prove the performance benefits, and then add the same or similar to 4.x.
Let's go ahead and merge and let users' batteries and CPUs rejoice
Thanks!
Absolutely, I'm fully expecting one or two special circumstances that need a slight tweak, but it can easily be turned off. 👍
Was this ever implemented in 4.x? I don't see it anywhere and I'm concerned about visibility enabler performance as in #63193
I mentioned to reduz while implementing, but as far as I remember, he wasn't super convinced about having it in 4.x (I think he tried to get this working long ago, but had problems where it was a win for some cases but to the detriment of others). But if there was demand it might be possible to get through politically - there are a lot of non-obvious considerations here. For instance, it does admittedly complicate the 2D culling code which affects maintenance. 3.x is fairly stable (so not such a problem), whereas 4.x is in flux. But if there is enough interest we may be able to get this into 4.x.
I did see a PR for 2D sprite batching in 4.x, so maybe that will help performance for rendering stuff off-screen. But ideally, if something is off screen it shouldn't be rendered at all (not just have its animation stopped). As a user I can't set the visibility of the root node to false while it's off screen, because its visibility is used for calculating the screen_exited and screen_entered signals 🤦 What about a scenario like this, where you have 3D output as 2D using a Sprite2D? The 2D would get culled, but what about the 3D in the nested viewport? Would you need an Enabler3D that emits signals off the Enabler2D's signals? (Honestly I might have to just use pre-rendered 2D, because the performance of doing this kind of thing (3D->2D) is really bad, although I like the flexibility that 3D provides 😿)
Adds optional hierarchical culling to the 2D rendering (within VisualServer).
Each canvas item maintains a bound in local space of the item itself and all child / grandchild items. This allows branches to be culled at once when they don't intersect a viewport.
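As a rough illustration of the idea (not the code in this PR), the sketch below keeps a per-item bound that covers the item and all its visible descendants, and rejects a whole branch when that bound misses the viewport. The `Rect` and `Item` types, the translation-only transform, and the function names `update_bound` and `cull` are all assumptions made for the example; the real `VisualServer` types differ.

```cpp
#include <algorithm>
#include <vector>

// Minimal axis-aligned rectangle; a stand-in for Godot's Rect2.
struct Rect {
    float x, y, w, h;

    bool intersects(const Rect &o) const {
        return x < o.x + o.w && o.x < x + w && y < o.y + o.h && o.y < y + h;
    }
    Rect merged(const Rect &o) const {
        float x0 = std::min(x, o.x), y0 = std::min(y, o.y);
        float x1 = std::max(x + w, o.x + o.w), y1 = std::max(y + h, o.y + o.h);
        return {x0, y0, x1 - x0, y1 - y0};
    }
    Rect translated(float dx, float dy) const { return {x + dx, y + dy, w, h}; }
};

// Hypothetical canvas item. The transform is reduced to a translation
// relative to the parent to keep the sketch short.
struct Item {
    float ox = 0.f, oy = 0.f;                 // offset from parent
    Rect rect{0.f, 0.f, 0.f, 0.f};            // the item's own rect, local space
    Rect local_bound{0.f, 0.f, 0.f, 0.f};     // own rect plus all visible descendants
    bool visible = true;
    std::vector<Item *> children;
};

// Recompute local_bound bottom-up: the item's own rect merged with each
// visible child's bound moved into this item's space. An empty own rect still
// contributes its origin, which only makes the bound more conservative.
Rect update_bound(Item &item) {
    Rect b = item.rect;
    for (Item *c : item.children) {
        if (!c->visible) continue;
        b = b.merged(update_bound(*c).translated(c->ox, c->oy));
    }
    item.local_bound = b;
    return b;
}

// Walk the tree, skipping a whole branch when its bound misses the viewport.
// (px, py) is the accumulated parent offset. Items whose own rect is on
// screen are appended to `out`. Returns the number of items visited.
int cull(const Item &item, float px, float py, const Rect &viewport,
         std::vector<const Item *> &out) {
    if (!item.visible) return 0;
    float wx = px + item.ox, wy = py + item.oy;
    if (!item.local_bound.translated(wx, wy).intersects(viewport)) {
        return 1; // branch rejected without touching any descendant
    }
    int visited = 1;
    if (item.rect.translated(wx, wy).intersects(viewport)) out.push_back(&item);
    for (const Item *c : item.children) visited += cull(*c, wx, wy, viewport, out);
    return visited;
}
```

With a far off-screen parent that has many children, `cull` touches only the parent, which is where the saving over testing every item individually would come from.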
Background
… VisibilityEnabler to work with this, it might be possible to add some kind of automatic hierarchical culling, for instance using the scene graph, or a spatial partitioning structure such as BVH or similar.
… VisualServer, allowing the possibility for using this directly as spatial partitioning.
How it works
… Item - the local bound. This is a Rect2 indicating the bound in local space of the Item and all its non-hidden children and grandchildren.
Housekeeping and Rendering
… Item, the bound of the item itself must be marked dirty (to be calculated next time). Additionally, the bounds of all parent items are marked dirty, as they may be modified.
Costs and Benefits
There is thus a small housekeeping cost to the technique - probably around 2% (of the time taken by the preparation / culling code). In return the wins are quite significant. Overall the preparation phase is typically 4-10x faster.
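The housekeeping in question is the dirty marking of an item's bound and of its ancestors' bounds. A minimal sketch of that step might look like the following. The `Item` struct and `mark_bound_dirty` are hypothetical names, and the early exit is an assumption of this example (it relies on dirty flags only ever being cleared for a whole ancestor chain at once), not something stated in the PR.

```cpp
// Hypothetical item holding only what the dirty-marking step needs.
struct Item {
    Item *parent = nullptr;
    bool bound_dirty = false;
};

// Mark the item's bound dirty, then each ancestor's in turn. Stops at the
// first ancestor that is already dirty, assuming everything above it was
// marked by an earlier call. Returns the number of items newly marked.
int mark_bound_dirty(Item *item) {
    int marked = 0;
    while (item && !item->bound_dirty) {
        item->bound_dirty = true;
        ++marked;
        item = item->parent;
    }
    return marked;
}
```

Under that assumption the cost of repeated changes within one frame stays small, since the walk up the tree stops as soon as it meets a bound that is already flagged.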
In cases where a lot is off screen (and can thus be culled) the gains can be large. In @BimDav's test project with 300,000 canvas items, the preparation code runs in the region of 16,000x faster, with a similar huge improvement to frame rate.
In the editor there are also speed improvements to the preparation / culling.
However, note that the preparation / culling is not always a major bottleneck, so even though there are huge improvements in the efficiency of preparation code, the overall boosts to frame rate are usually more modest.
Testing in jetpaca, I was typically getting increases from around 350 to 400fps, so about 15%.
Special cases
Most canvas items are only altered by calling functions in VisualServer, so it is easy to flag their bounds as dirty when such a change occurs. There are exceptions though for "dynamic" items, where changes are "pulled" rather than "pushed" to the server.
Skinned 2D Polygons
Skinned polygons pull their vertex transforms from a Skeleton each time the Skeleton changes. But there is a chicken and egg problem: in order to know whether the skeleton has changed, we need to call get_rect() on the Polygon2D, and this only occurs immediately prior to rendering, well after the time we expect to mass reject the Polygon2D using hierarchical culling.
The solution used here is that, instead of having a one-way relationship where the Polygon2D has a dependency on the Skeleton, the RID of the linked Polygon2D is now stored on the skeleton. Whenever the skeleton moves, the dependent polygons are informed, and their bounds made dirty.
This should always work, but is not ideal efficiency-wise: it is advisable to use VisibilityEnabler2D for each skinned character, which will prevent animation when off screen, and thus the bound will not need updating.
Particles
Particle bounds are not actually currently dynamic in 3.x. It turns out GLES2 returns Rect2(), and GLES3 can only return a custom rect. So they should work as is without modification for hierarchical culling.
Vertex Shaders (that move verts)
These would probably need the user to make a custom rect or apply expansion margin.
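To illustrate what an expansion margin amounts to, the sketch below grows a rect by a fixed amount on every side. The `Rect` struct and the free function `grow` are stand-ins written for this example, and the choice of margin (the largest distance the shader can move a vertex) is an assumption the user would have to supply.

```cpp
// Minimal rectangle; a stand-in for Godot's Rect2.
struct Rect {
    float x, y, w, h;
};

// Grow a rect by `margin` on every side, so that vertices displaced by at
// most `margin` in a vertex shader still fall inside the bound.
Rect grow(const Rect &r, float margin) {
    return {r.x - margin, r.y - margin, r.w + 2.0f * margin, r.h + 2.0f * margin};
}
```

A margin that is too small risks the item being culled while its displaced vertices are still on screen, so erring on the large side is the safer choice here.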
Notes
… project_settings/rendering/2d/options/cull_mode, between Item mode (old style) and Node mode (which is now the default).
… canvas_item names to the VisualServer, which enables you to identify nodes when printing the tree. This is normally switched off to save memory and performance. This can also be helpful for general 2D debugging in the VisualServerCanvas.
Optional defines (in visual_server_constants.h)
VISUAL_SERVER_CANVAS_TIME_NODE_CULLING - every 100 frames it runs both Item culling and Node culling, timing both, and displaying the timings using print_line. This enables direct comparison in different projects / scenes, and can be used in release.
VISUAL_SERVER_CANVAS_DEBUG_ITEM_NAMES - pass canvas item names to VisualServer for debugging.
VISUAL_SERVER_CANVAS_CHECK_BOUNDS - performs verification checks on all bounds to make sure they are correct and up to date, in order to detect bugs.