Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relax atomic memory ordering constraints #10

Closed
faern opened this issue May 8, 2020 · 9 comments · Fixed by #17
Closed

Relax atomic memory ordering constraints #10

faern opened this issue May 8, 2020 · 9 comments · Fixed by #17
Labels
enhancement New feature or request

Comments

@faern
Copy link
Owner

faern commented May 8, 2020

Currently all atomic memory orderings use SeqCst. This is to be conservative and avoid some bugs due to being careless early on in the development when lots of stuff change around. When the code matures somewhat a lot of the orderings can probably be relaxed.

Since this is tricky. Having the relaxed orderings audited by someone else who is good at this would be awesome.

@faern faern added the enhancement New feature or request label May 8, 2020
@jjl
Copy link

jjl commented Jul 27, 2021

Hello, I'm the author of a different oneshot channel.

Anyway, more than the memory orderings, what stuck out to me is you're missing out on a few neat tricks that allow you to really speed it up. In async-oneshot, I use an atomic integer as a bitmask. This allows for using atomic bit operations to take the locks. Atomic bit operations can't fail. You can for example issue a fetch_or of LOCK (a constant with a particular bit set) with Ordering::AcqRel and then just check the bit wasn't already set in the return value. Combine that with a spinloop (because you only ever take the lock to move a value) and you're golden.

But you did ask about memory orderings, so, to answer: you only need acquire/release semantics for a oneshot channel implemented with a single atomic. There are some situations where you can get away with relaxed (basically when you aren't going to access other memory on the basis of the atomic - you just want the atomic).

I've actually rewritten async-oneshot recently to support reuse (i.e. no longer be oneshot), but it still retains excellent performance characteristics (though we have to take a lock on write because i've stopped flagging the message. i might bring back the old behaviour, i'm not sure. at the moment. i'm a bit fatigued with it having done 3 consecutive rewrites since the last release and still not being entirely happy...).

I'm happy to answer any questions you may have. Or to collaborate, if you're up for it. I have big plans for async rust and it would be awesome to work with someone else on it. Or even just to get feedback really lol.

Cheers,
James

@faern
Copy link
Owner Author

faern commented Aug 29, 2021

I don't think I understand what lock you think I should use an atomic bitmask for? Could you link to some code in this repository?

I already use an atomic integer to read and manipulate the state of the channel. There is no locking going on in this library.

@jjl
Copy link

jjl commented Aug 30, 2021

Oh I see, on closer inspection, I got confused by the sync implementation.

However, I did notice a bug here. This violates the futures API contract, where you are always supposed to set a new waker. Whether it will happen to work correctly depends on your chosen executor's implementation.

When you consider this case, I expect you will find a spinloop necessary (even async-waker uses one).

@faern
Copy link
Owner Author

faern commented Aug 30, 2021

Thanks! Yeah I made that mistake in another crate as well: triggered. But I never fixed it here 🙈

I should probably fix that.

@faern
Copy link
Owner Author

faern commented Nov 23, 2021

@jjl Hi again. It took me a loooong time to get around to look at this keep-last-waker-not-first-waker-issue. But I have now implemented what I believe is a proper fix. I have not merged it yet. If you would like to, I would appreciate a review: #15

I solved it without a spinloop. The receiver can always make progress without looping here. It can try to replace the old waker with a new waker once, and if it fails (because the state changed during the swap) then the sender must have made some progress that allows us to skip updating the waker, and just return instead. So no extra dependencies, and no extra looping.

@jjl
Copy link

jjl commented Nov 23, 2021

okay, i think this will work.

github spews a ton of warnings at me about the compare_and_swap though, you might want to update those to compare_exchange_*.

Cheers,
James

@faern
Copy link
Owner Author

faern commented Nov 23, 2021

Thanks. Yeah, I'll get to update those later. But those are not bugs, just deprecation warnings. compare_and_swap will continue to work.

@jjl
Copy link

jjl commented Nov 23, 2021

indeed, they have no impact on correctness :)

@faern
Copy link
Owner Author

faern commented Aug 20, 2022

A lot of orderings were relaxed in #17. But not all of them. The remaining orderings are appropriately set in #20

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants