Texture depal using CLUT loaded from framebuffers, and more. Fixes Burnout Dominator lens flare #16014

Merged: 21 commits, Sep 16, 2022

hrydgard (Owner) commented Sep 12, 2022

See #11100.

  • Allows shader depal of CLUT8 reads from RGBA 8888 framebuffers. The game uses this to read a single texel, so a full depal pass just isn't appropriate; I guess it could be supported for generality, but meh for now.
  • Allows doing "gpu depal" for static textures with dynamic CLUTs from framebuffers
  • Enables Bitmask emulation for Burnout Dominator since it's required for the effect
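
For readers unfamiliar with depal, here is a minimal CPU-side sketch of what a CLUT8 lookup on a 32-bit framebuffer pixel amounts to. It is not the shader code from this PR. The struct, the function names and the shift/mask/offset parameters are hypothetical, loosely modeled on how PSP-style CLUT indices are usually described.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical parameters for generating a palette index (assumption, not PPSSPP's types).
struct ClutIndexParams {
    uint32_t shift;   // right shift applied to the raw source pixel
    uint32_t mask;    // bitmask applied after the shift
    uint32_t offset;  // base index OR'ed into the result
};

// Derive a palette index from a raw 32-bit (RGBA 8888) framebuffer pixel.
inline uint32_t ClutIndexFromPixel(uint32_t pixel, const ClutIndexParams &p) {
    return ((pixel >> p.shift) & p.mask) | p.offset;
}

// Replace the pixel with the palette color it indexes (256-entry CLUT8 palette).
inline uint32_t DepalPixel(uint32_t pixel, const ClutIndexParams &p,
                           const std::array<uint32_t, 256> &clut) {
    const uint32_t index = ClutIndexFromPixel(pixel, p);
    assert(index < clut.size());
    return clut[index];
}
```

With shift 8 and mask 0xFF, for example, the index comes from the second-lowest byte of the pixel. A GPU version would do the same arithmetic per fragment, which is presumably why bitmask support matters for this effect.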

Some of the stuff I'm doing here I'd really like to generalize better eventually, maybe not all in this PR.

TODO-list:

  • Fix OpenGL ES build
  • Fix effect on OpenGL in high res (works in 1x)
  • Fix effect on OpenGL ES. Seems the Z buffer init isn't working.
  • Fix effect on D3D11 in high res (works in 1x)
  • Try to make the buffer matching for copies cleaner (current heuristic causes a "green flash" the first frame since we don't know it's a CLUT until it's been used as such)
  • Depal with palette from framebuffer, to avoid the readback (kills performance on mobile)

Will not try to implement this on D3D9, at least not initially. Biggest obstacle is the bitmasking (gonna need lookup table textures, or icky special casing...)
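
As one possible reading of "lookup table textures" (an assumption, not something this PR implements): on hardware without integer bitwise operations in shaders, a masked write with a fixed mask can be expressed as two table lookups and an add, because the kept source bits and the kept destination bits never overlap. The sketch below does this on the CPU for one 8-bit channel. The names and the convention that set mask bits protect the destination are hypothetical.

```cpp
#include <array>
#include <cstdint>

// Two 256-entry tables that together emulate a masked write for one 8-bit channel.
struct MaskLuts {
    std::array<uint8_t, 256> keepSrc;  // keepSrc[v] == v & ~protectMask
    std::array<uint8_t, 256> keepDst;  // keepDst[v] == v & protectMask
};

// Precompute both tables for a fixed mask. Assumption: set bits protect the destination.
inline MaskLuts BuildMaskLuts(uint8_t protectMask) {
    MaskLuts luts{};
    for (int v = 0; v < 256; ++v) {
        luts.keepSrc[v] = static_cast<uint8_t>(v & ~protectMask);
        luts.keepDst[v] = static_cast<uint8_t>(v & protectMask);
    }
    return luts;
}

// The two partial results occupy disjoint bits, so adding them equals OR-ing them.
inline uint8_t MaskedWrite(uint8_t src, uint8_t dst, const MaskLuts &luts) {
    return static_cast<uint8_t>(luts.keepSrc[src] + luts.keepDst[dst]);
}
```

In a shader, the two arrays would become small textures sampled by channel value, and the tables would have to be rebuilt whenever the mask changes.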

Burnout.Lens.Flare.mp4

hrydgard changed the title from "A pile of tricks to get lens flares in Burnout to work" to "A pile of tricks to get lens flares in Burnout Dominator to work" on Sep 12, 2022
hrydgard force-pushed the shader-depal-clut8-8888 branch 2 times, most recently from d9b2f51 to c2b52bf, on September 12, 2022 21:33
hrydgard added the "GE emulation" (Backend-independent GPU issues) label on Sep 14, 2022
hrydgard added this to the v1.14.0 milestone on Sep 14, 2022
hrydgard changed the title from "A pile of tricks to get lens flares in Burnout Dominator to work" to "Texture depal using CLUT loaded from framebuffers, and more. Fixes Burnout Dominator lens flare" on Sep 14, 2022
hrydgard commented Sep 14, 2022

Added CLUT loads from framebuffers, complete with color reinterpret support, as required by Burnout.
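
To illustrate what "color reinterpret" means in general, the sketch below decodes the same 16 bits under two different 16-bit formats and gets different colors. This is an illustration only, not the PR's code. The bit layouts (red in the low bits) and all names are assumptions.

```cpp
#include <cstdint>

struct Rgba8 {
    uint8_t r, g, b, a;
};

// Expand a 5-bit or 6-bit channel to 8 bits by replicating the high bits.
inline uint8_t Expand5(uint32_t v) { return static_cast<uint8_t>((v << 3) | (v >> 2)); }
inline uint8_t Expand6(uint32_t v) { return static_cast<uint8_t>((v << 2) | (v >> 4)); }

// Assumed layout: R in bits 0-4, G in bits 5-10, B in bits 11-15.
inline Rgba8 Decode565(uint16_t c) {
    return Rgba8{Expand5(c & 0x1F), Expand6((c >> 5) & 0x3F), Expand5((c >> 11) & 0x1F), 255};
}

// Assumed layout: R in bits 0-4, G in bits 5-9, B in bits 10-14, A in bit 15.
inline Rgba8 Decode5551(uint16_t c) {
    return Rgba8{Expand5(c & 0x1F), Expand5((c >> 5) & 0x1F), Expand5((c >> 10) & 0x1F),
                 static_cast<uint8_t>((c >> 15) ? 255 : 0)};
}
```

The value 0x8000, for instance, is a mid-intensity blue when read as 565 but an opaque black when read as 5551, so a CLUT loaded from a framebuffer has to be decoded in the format the game asks for rather than the format it was rendered in.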

So this now works without readbacks in Burnout Dominator, but Ridge Racer's framebuffer margin shenanigans currently break it when it tries to load a CLUT from the margin and we misdetect it as loading from the main framebuffer... EDIT: Never mind, that's fixed now.

hrydgard commented

I can remove the CLUT download code entirely now, but can also keep it around in a disabled state. Can be useful for checking the read-back values without using RenderDoc...

hrydgard marked this pull request as ready for review on September 14, 2022 21:55
hrydgard commented

Opening for review, though I have one checkbox left to do, so ignore that part if you look at this.

assets/compat.ini (outdated, resolved)
GPU/Common/FramebufferManagerCommon.cpp (resolved)
// First we use a blit (with nearest interpolation, so we don't mash pixels together)
// to shrink to the correct size, if we are running with scaling.
// We can always blit 512 pixels even if we only need less, the cost will be negligible.
framebufferManager_->BlitUsingRaster(
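
The code comment above describes shrinking with nearest filtering. As a rough CPU-side illustration of why that matters for palette data (this is not what BlitUsingRaster does internally, and the function name is hypothetical), the sketch below shrinks one upscaled row back to 1x by picking a single source texel per destination texel instead of averaging neighbors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shrink a row rendered at an integer scale factor back to 1x.
// Each output texel copies exactly one source texel, so palette entries are never blended.
inline std::vector<uint32_t> ShrinkRowNearest(const std::vector<uint32_t> &scaled, size_t scale) {
    assert(scale >= 1 && scaled.size() % scale == 0);
    std::vector<uint32_t> out(scaled.size() / scale);
    for (size_t x = 0; x < out.size(); ++x) {
        // Sample near the center of each block of `scale` source texels.
        out[x] = scaled[x * scale + scale / 2];
    }
    return out;
}
```

A linear filter would mix adjacent entries and produce colors that were never in the palette.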
unknownbrackets (Collaborator) commented:

Hm, ideally we'd do this on loadclut. I could totally see a game rendering a clut, loading it, and then rendering straight away on top of the rendered clut. In fact, Brave Story might even do this? I don't remember... well, probably not.

-[Unknown]

hrydgard (Owner, Author) replied:

Hm, there's one issue here: I think we then have to always copy to the temp here, and do the reinterpret (if needed) at the time of ApplyTextureDepal, since the specified CLUT format can change in between.

That does sound reasonable though, I will take care of that later. It's just an extra tiny copy in the case of matching formats.
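
The ordering discussed here could be sketched roughly as follows: copy the raw bytes unconditionally at load time, and only decide about reinterpretation when the CLUT is actually applied. All types and names below are hypothetical and greatly simplified compared with the real texture cache.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class PixelFormat { RGB565, RGBA5551, RGBA4444, RGBA8888 };

// Raw CLUT bytes captured at load time, tagged with the format they were rendered in.
struct PendingClut {
    std::vector<uint8_t> rawBytes;
    PixelFormat sourceFormat;
};

// At load time: always copy, without looking at the CLUT format register yet.
inline PendingClut CaptureAtLoad(const uint8_t *data, size_t size, PixelFormat framebufferFormat) {
    return PendingClut{std::vector<uint8_t>(data, data + size), framebufferFormat};
}

// At apply time: the CLUT format is final, so this is where a reinterpret is decided.
inline bool NeedsReinterpret(const PendingClut &clut, PixelFormat clutFormatAtApply) {
    return clut.sourceFormat != clutFormatAtApply;
}
```

When the formats match, the only cost is the small copy mentioned above.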

hrydgard (Owner, Author) replied:

Done!

hrydgard commented Sep 15, 2022

Thanks, will address these.

I got another idea by the way. The CLUT framebuffer gets oversized because we look at viewport and stuff to determine the framebuffer size. I think that for framebuffers that only ever get small through-mode draws, we could keep them in a special mode where we pre-decode each draw to get the positions and only use those for the size - that should help for many oversized targets used in post-processing and similar. But that's out of scope for this one.
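
A minimal sketch of that idea, assuming the draw positions have already been decoded into pixel coordinates (the structs and the function are hypothetical, not existing PPSSPP code): the framebuffer size is taken from the largest coordinates any draw actually touches, instead of from the viewport.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vertex2D {
    float x, y;  // through-mode positions, already in pixel coordinates
};

struct Extent {
    int width, height;
};

// Estimate the size a framebuffer really needs from the positions of its draws.
inline Extent ExtentFromDraws(const std::vector<Vertex2D> &positions) {
    float maxX = 0.0f;
    float maxY = 0.0f;
    for (const Vertex2D &v : positions) {
        maxX = std::max(maxX, v.x);
        maxY = std::max(maxY, v.y);
    }
    return Extent{static_cast<int>(std::ceil(maxX)), static_cast<int>(std::ceil(maxY))};
}
```

For a CLUT target that only ever receives a 256x1 rectangle, this would give 256x1 rather than a viewport-sized buffer.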

hrydgard commented Sep 15, 2022

OK, so this is still not quite working properly on GLES (the depth masking stuff doesn't seem to work, and it's not about the range scaling, though that should be fixed too), and the heuristics aren't super clean. But I'd still like to get it in so people can try it and so that we can build on this. The last issues can be fixed separately.

hrydgard merged commit ca2962b into master on Sep 16, 2022
hrydgard deleted the shader-depal-clut8-8888 branch on September 16, 2022 06:33
unknownbrackets (Collaborator) commented

Quoting hrydgard: "I think that for framebuffers that only ever get small through-mode draws, we could keep them in a special mode where we pre-decode each draw to get the positions and only use those for the size"

Mostly, this is what the "safe area" of framebuffers is already doing, although it's focused on clears. Sizing the actual framebuffer is harder because a lot of these temp buffers get used at different sizes throughout the frame.

-[Unknown]

hrydgard commented

Right, though extending the safe area mechanism to non-clear draws that way might have some use. It indeed won't always fix things though, that's for sure...
