Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix object layout races #285

Merged
merged 23 commits into from
Feb 5, 2019
Merged

Fix object layout races #285

merged 23 commits into from
Feb 5, 2019

Conversation

daumayr
Copy link
Contributor

@daumayr daumayr commented Jan 14, 2019

This PR fixes issue #85.

The Phaser used in ObjectTransitionSafepoint used a default Phaser, which Terminates when onAdvance(...) returns true. This caused the synchronisation methods to have no effect.
The default onAdvance implementation causes the Phaser to terminate when the number of registered Threads is 0.

By overwriting the onAdvance method and returning false we can prevent Phaser from terminating and preserve the synchronisation.

@daumayr
Copy link
Contributor Author

daumayr commented Jan 15, 2019

While tests are running for this branch, we still have an issue with some benchmarks. The Vacation Benchmark still deadlocks, so ideally we don't merge this PR until we fix the Join Prim and similar things.

@smarr
Copy link
Owner

smarr commented Jan 15, 2019

Ok, will see what I can do about those. Will probably try to do them in a separate PR to see performance issues independently.

@daumayr
Copy link
Contributor Author

daumayr commented Jan 15, 2019

I think we need to exclude those benchmarks (LeeTm and Vacation) until then.

Repository owner deleted a comment Jan 16, 2019
Repository owner deleted a comment Jan 16, 2019
Repository owner deleted a comment Jan 16, 2019
Repository owner deleted a comment Jan 16, 2019
Repository owner deleted a comment Jan 16, 2019
Repository owner deleted a comment Jan 16, 2019
Copy link
Owner

@smarr smarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick review to give at least some comments.
Sorry, didn't get to more.

This change introduces our formatting and names the class SafepointPhaser and sets the package name correctly.

No changes of functionality.

Signed-off-by: Stefan Marr <[email protected]>
Repository owner deleted a comment Feb 1, 2019
Repository owner deleted a comment Feb 1, 2019
smarr and others added 5 commits February 2, 2019 13:57
Co-authored-by: Dominik Aumayr <[email protected]>
Signed-off-by: Stefan Marr <[email protected]>
The SafepointPhaser should not terminate, even if no parties are registered on it anymore.

This can happen with our safepoint design, but it is not safe to terminate the phaser. It needs to continue working and accept threads to join again.

Co-authored-by: Dominik Aumayr <[email protected]>
Signed-off-by: Stefan Marr <[email protected]>
Remove interruptible, timed, and deadline notions.

Signed-off-by: Stefan Marr <[email protected]>
SafepointPhaser isn’t a general purpose class, so, let’s provide features only when needed.

Signed-off-by: Stefan Marr <[email protected]>
daumayr and others added 4 commits February 2, 2019 13:58
Co-authored-by: Dominik Aumayr <[email protected]>
Signed-off-by: Stefan Marr <[email protected]>
This change uses the knowledge of the safepoint consisting of two phases. This allows us to manage registrations without extra race, and it allows us to renew the assumption as part of the arrival routine in the phaser.

Signed-off-by: Stefan Marr <[email protected]>
@smarr smarr added this to the v0.7.0 milestone Feb 2, 2019
@smarr smarr added the bug Fixes an issue, incorrect implementation label Feb 2, 2019
Repository owner deleted a comment Feb 2, 2019
Repository owner deleted a comment Feb 2, 2019
@smarr
Copy link
Owner

smarr commented Feb 2, 2019

@daumayr I come up with what I think is a working solution. Could you please review and possible test this?
I haven't yet thought of a good test on how to stress the safepoints.
Not sure I can induce repeated safepoints easily.
Would be good to discuss this in detail, to see whether there are any know races left, or any other issue that may make this approach problematic.

The most important bit to discuss is likely 0274546

@daumayr
Copy link
Contributor Author

daumayr commented Feb 4, 2019

(phase & 1) == 0
I was thinking of using something similar when i did some work on the safepoints. When no safepoint is going on and all threads deregister the phaser advances to the next phase. After that the even/odd check would no longer work. You might want to change the doArrive method to not advance phase when there are no threads left.

@smarr
Copy link
Owner

smarr commented Feb 4, 2019

@daumayr hmmmm, ok. Didn't think of that. Looking at the code: https://github.com/smarr/SOMns/pull/285/files#diff-2b3be932b680eb719ab91a02f3b6f064R137

I suppose, I could simply increment the phase in that case by 2, no?

@smarr
Copy link
Owner

smarr commented Feb 4, 2019

hmmm, well, possibly depending on in which concrete state we are. Though, we should never unregister during a safepoint. So, probably increment by 2 plus an assertion that we are never in a phase that indicates a safepoint is going on

@daumayr
Copy link
Contributor Author

daumayr commented Feb 4, 2019

Increment by two sounds good, that way we still have a phase change and keep the odd/even.

@smarr
Copy link
Owner

smarr commented Feb 4, 2019

ok, cool.

and one more todo:

  • add unit tests that test the basic logic for correctness, and have asserts for all the parts not covered

@smarr
Copy link
Owner

smarr commented Feb 4, 2019

@daumayr could you give this a spin with your snapshot tests?
Would also be great if you could review it and look at the added test. Any obvious test we should add?

Repository owner deleted a comment Feb 4, 2019
Repository owner deleted a comment Feb 4, 2019
@daumayr
Copy link
Contributor Author

daumayr commented Feb 5, 2019

Snapshot Tests running, didn't see any errors.

@smarr smarr merged commit 29a2c9f into smarr:dev Feb 5, 2019
@daumayr daumayr deleted the FixObjectLayoutRaces branch February 5, 2019 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fixes an issue, incorrect implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants