Pass the return of fill_path in a pointer #382

Merged 1 commit on Oct 27, 2023

@Zoxc (Collaborator) commented Oct 12, 2023

This is just a workaround for gfx-rs/wgpu#4393.

@raphlinus (Contributor) left a comment

Thanks, this seems like a reasonable workaround, and not entirely unidiomatic. It reminds me of programming K&R-style C before returning structs was allowed, back when I was a kid.
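
For readers unfamiliar with the pattern, here is a minimal Rust sketch of handing a result back through a caller-provided reference instead of as a return value. The actual change is not shown in this conversation; `FillResult`, `fill_by_value`, and `fill_into` are hypothetical names made up for illustration and do not reflect the real `fill_path` signature.

```rust
/// Hypothetical stand-in for whatever a path-filling routine produces.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct FillResult {
    area: f32,
    backdrop: i32,
}

/// Direct style: the struct is the function's return value.
fn fill_by_value(coverage: &[f32], backdrop: i32) -> FillResult {
    FillResult {
        area: coverage.iter().sum(),
        backdrop,
    }
}

/// Workaround style: the caller owns the storage and the callee writes
/// the result through a mutable reference (an "out parameter").
fn fill_into(coverage: &[f32], backdrop: i32, result: &mut FillResult) {
    result.area = coverage.iter().sum();
    result.backdrop = backdrop;
}
```

A caller would declare `let mut out = FillResult::default();` and then call `fill_into(&coverage, backdrop, &mut out)`. Both styles compute the same value; only the way it travels back to the caller differs.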

raphlinus added a commit that referenced this pull request Oct 24, 2023
Create a specialized version of the fill function for the even-odd fill rule. The logic is simpler (and faster) because winding number accumulation can happen in one bit.

There's a bunch of code duplication which can be cleaned up.

It's expected this will have a merge conflict with #382. If that's merged first, I'll happily fix this one.
@Zoxc Zoxc merged commit 1aed7fb into linebender:main Oct 27, 2023
@Zoxc Zoxc deleted the dx12-tweak branch October 27, 2023 08:15
raphlinus added a commit that referenced this pull request Oct 30, 2023
* Prototype 8 bit winding number accumulation

Changes the winding number accumulation in msaa8 mode to 8 bits per sample. This is prototype code and currently breaks the msaa16 mode; it is intended to diagnose whether the artifacts are strictly due to overflow, and to point the way to a real implementation.

Prefix sums in both x and y direction are a little cleaner, avoiding a race (not UB because it's atomics).

* Make msaa16 mode use 8 bit accumulation

This patch makes the msaa16 mode work again, using 8 bit accumulation of winding numbers. It could be merged to fix the artifacts in the cardioid example. Also, it's worth doing some evaluation to see how much performance slowdown there is.

As future work, we probably want to be adaptive and use 8 bit accumulation when needed. If the performance hit from reduced occupancy due to the increased shared memory usage is significant, then we could consider other mitigations, including downgrading to msaa8 when overflow is possible.
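
The overflow concern above can be illustrated with a small Rust sketch of per-sample counters packed into a `u32`. This is an assumption-laden toy, not the shader code: the function names are hypothetical, and the 4-bit lane width is chosen only for contrast with the 8-bit width named in the commit message.

```rust
/// Adds `delta` to every `bits`-wide lane packed into one u32,
/// wrapping around inside each lane when the counter overflows.
fn add_to_lanes(packed: u32, delta: u32, bits: u32) -> u32 {
    assert!(bits > 0 && bits < 32 && 32 % bits == 0);
    let lanes = 32 / bits;
    let mask = (1u32 << bits) - 1;
    let mut out = 0u32;
    for i in 0..lanes {
        let lane = (packed >> (i * bits)) & mask;
        out |= (lane.wrapping_add(delta) & mask) << (i * bits);
    }
    out
}

/// Extracts the counter stored in lane `index`.
fn lane(packed: u32, index: u32, bits: u32) -> u32 {
    (packed >> (index * bits)) & ((1u32 << bits) - 1)
}

/// Number of 32-bit words needed to hold `samples` counters of `bits` bits.
fn words_needed(samples: u32, bits: u32) -> u32 {
    (samples * bits + 31) / 32
}
```

In this toy model, a narrow lane silently wraps once enough same-direction crossings pile up on one sample, while a wider lane keeps the true count at the cost of more words per pixel. That extra storage is the kind of increase the commit message links to reduced occupancy.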

* Add even-odd fill rule

Create a specialized version of the fill function for the even-odd fill rule. The logic is simpler (and faster) because winding number accumulation can happen in one bit.

There's a bunch of code duplication which can be cleaned up.

It's expected this will have a merge conflict with #382. If that's merged first, I'll happily fix this one.
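
The one-bit accumulation described for the even-odd fill rule can be sketched in Rust as follows. This is a simplified illustration under the assumption that each edge crossing contributes +1 or -1 to a sample's winding number; the function names are hypothetical and the code is not taken from the commit.

```rust
/// Non-zero rule: a sample is inside when the signed crossings do not cancel,
/// so the full winding number has to be accumulated.
fn nonzero_covered(crossings: &[i32]) -> bool {
    crossings.iter().sum::<i32>() != 0
}

/// Even-odd rule: only the parity of the winding number matters,
/// so a single bit toggled by XOR is enough.
fn even_odd_covered(crossings: &[i32]) -> bool {
    let mut parity = false;
    for &c in crossings {
        // An odd contribution flips the parity bit; direction is irrelevant.
        parity ^= (c & 1) != 0;
    }
    parity
}
```

The two rules disagree exactly when the winding number is even and non-zero, for example where two same-direction contours overlap.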

* Prepare for merge

* Add a bunch of comments

I did my best to document some of the strange bit magic used in the algorithm. I also did just a bit of renaming to make things simpler, and for the mask expansion replaced `|` with `^` because it's easier to understand in terms of carry-less multiplication (and I expect performance to be identical).
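
To make the carry-less multiplication remark concrete, here is a Rust sketch of a generic bit-spreading routine written both ways. It is not the expansion used in the shader, which is not shown here; `clmul`, `expand_or`, and `expand_clmul` are hypothetical names, and the constants are a textbook spread of 8 bits into 8 nibbles.

```rust
/// Carry-less multiplication: like schoolbook binary multiplication,
/// but partial products are combined with XOR instead of addition.
fn clmul(a: u32, b: u32) -> u32 {
    let mut acc = 0u32;
    for i in 0..32 {
        if ((b >> i) & 1) == 1 {
            acc ^= a << i;
        }
    }
    acc
}

/// Spreads the low 8 bits of `x` so that bit i lands at bit 4*i, using OR.
fn expand_or(x: u32) -> u32 {
    let mut x = x & 0xFF;
    x = (x | (x << 12)) & 0x000F_000F;
    x = (x | (x << 6)) & 0x0303_0303;
    x = (x | (x << 3)) & 0x1111_1111;
    x
}

/// Same spread, with each step `x ^ (x << k)` written as a
/// carry-less multiply by the constant `1 | (1 << k)`.
fn expand_clmul(x: u32) -> u32 {
    let mut x = x & 0xFF;
    x = clmul(x, 1 | (1 << 12)) & 0x000F_000F;
    x = clmul(x, 1 | (1 << 6)) & 0x0303_0303;
    x = clmul(x, 1 | (1 << 3)) & 0x1111_1111;
    x
}
```

In this particular spread the shifted copy never overlaps the original bits, so OR and XOR give identical results, and the XOR form has the advantage of reading as a multiplication by a fixed constant.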

* Typo