Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refresh context only after thread switch #520

Merged
merged 2 commits into from
Jul 28, 2016
Merged

Refresh context only after thread switch #520

merged 2 commits into from
Jul 28, 2016

Conversation

EliasC
Copy link
Contributor

@EliasC EliasC commented Jul 28, 2016

This commit tries to keep the number of calls to pony_ctx to a minimum
by only updating it after a (potential) switch of threads, but then doing it
"across the stack" by using a pointer to a stack variable in the run
function of the scheduler. Most functions have been changed to pass
around this double pointer to the context (rather than just a pointer), and
the most important line is line 236 of encore.c:

*ctx = pony_ctx(); // Context might have gone stale, update it

No other part of the code should need to write to *ctx.

This commit fixes #434.

This commit fixes #434. It tries to keep the number of calls to pony_ctx
to a minimum by only updating after a (potential) switch of threads, but
then doing it "across the stack" by using a double pointer to a stack
variable in the `run` function of the scheduler.
@kikofernandez
Copy link
Contributor

kikofernandez commented Jul 28, 2016

BTW, this PR runs GCBench 11% faster than the current development branch when compiled with -O3!

@EliasC
Copy link
Contributor Author

EliasC commented Jul 28, 2016

@albertnetymk: since me and @kikofernandez wrote this, could you review it?

@EliasC
Copy link
Contributor Author

EliasC commented Jul 28, 2016

BTW, this PR runs GCBench 11% faster than the current development branch!

I could only confirm 3%, but that's still better!

@kikofernandez
Copy link
Contributor

You should not use the Makefile from that benchmark but encorec -O3 GCBench.enc

@EliasC
Copy link
Contributor Author

EliasC commented Jul 28, 2016

bash-3.2$ encorec -O3 GCBench.enc
bash-3.2$ time for i in `seq 10`; do ./GCBench; done
...
real    1m3.897s
user    1m58.835s
sys 0m3.335s

(6.39s average)

vs.

bash-3.2$ encorec -O3 GCBench.enc
bash-3.2$ time for i in `seq 10`; do ./GCBench; done
...
real    1m5.865s
user    2m4.810s
sys 0m3.342s

(6.59 average)

@kikofernandez
Copy link
Contributor

in my machine, using your same loop:

-- Running GCBench from this PR
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:49.53

-- Running GCBench in `development`
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:53.83

This suggests a speedup of 8.6% (53.83 / 49.53)

@EliasC
Copy link
Contributor Author

EliasC commented Jul 28, 2016

Nice!

@albertnetymk
Copy link
Contributor

I originally thought suspend, await and get all need to taken care of in order to do this double pointer trick, but apparently, they use the same code for context manipulation. The final ctx update in future_block_actor looks redundant.

I was expecting clang to cache *ctx, but looking at the assembly, it seems to be doing deref all the time. I suggest to put pony_ctx_t **cctx as the first argument, and pony_ctx_t *ctx = *cctx as the first statement if ctx is needed in the current function.

  pony_gc_send(*ctx);
  pony_trace(*ctx, entry);
  pony_traceactor(*ctx, (pony_actor_t *)entry->actor);
  pony_send_done(*ctx);

I suggest using (void)var to surpass the unused var warning, instead of placing __attribute__((unused)) in the signature, so that the declaration in the header and the definition are consistent and clean.

The call to pony_ctx() in actor_run seems redundant.

@EliasC
Copy link
Contributor Author

EliasC commented Jul 28, 2016

The final ctx update in future_block_actor looks redundant.

I thought we removed that! Fixed now

I was expecting clang to cache *ctx, but looking at the assembly, it seems to be doing deref all the time.

We were expecting the same! Fixed as per your suggestion now.

The call to pony_ctx() in actor_run seems redundant.

Agreed. Fixed.

@albertnetymk
Copy link
Contributor

I think this can be merged. Would do so in an hour if no objections.

@kikofernandez
Copy link
Contributor

kikofernandez commented Jul 28, 2016

To everyone:
I think it was great to have an open discussion in #517 which lead us to try a new and more complex implementation (this one) that fixes #434, allows us to re-introduce the ParT extract combinator and cleans most of the calls to encore_alloc and pony_alloc. Great team effort!

@albertnetymk albertnetymk merged commit 48fd23a into parapluu:development Jul 28, 2016
@EliasC EliasC deleted the fix/ctx-refresh branch July 28, 2016 13:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

closure built after get is subject to stale context
4 participants