Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a faster (more time efficient) data structure for clock.timers #159

Closed
akhilkedia opened this issue Apr 3, 2018 · 19 comments
Closed
Labels

Comments

@akhilkedia
Copy link

Summary
Right now, clock.timers in Lolex is an array, which is looped through completely everytime in firstTimer() or LastTimer().
This is extremely inefficient, and perhaps other data structures such as a heap might be much faster.

Background Details
Lolex is great for running simulations - but in one of my recent projects, I attempted to use Lolex to run a code over a simulated period of 1 week - the code made a LOT of setTimeouts, and these were all handled by Lolex.

Profiling the code showed the majority of my simulation runtime was spent in Lolex's firstTimer() function, because it kept looping through a long array of timers everytime.

Changing clock.timers to a simple heap fixed this performance bottleneck.

Since Lolex is targeted primarily towards running simulations/tests, being able to run them faster will be great.

@benjamingr
Copy link
Member

Actually, timers is a hashmap (well object) and we iterate the hashmap for the first one, if it were an array and it was kept sorted then inserts would be pretty fast and firstTimer would be similarly fast.

@akhilkedia
Copy link
Author

My bad, yes timers is a Hashmap (object.)

Even if you change timers to a sorted array, while first/last and removing a timer could be fast, insertion into a sorted array would still take O(n) time (n is length of timers array) as the other elements in the array would need to be shifted.

Hence I suggest using any priority queue structure, such as a heap.

@benjamingr
Copy link
Member

@akhilkedia any interest in working on this? I can point you to the right places in the code if you'd like :)

If not I can take a stab next month probably.

@Alexsey
Copy link
Contributor

Alexsey commented Apr 19, 2018

@benjamingr do you mean that this https://github.com/sinonjs/lolex/blob/master/src/lolex-src.js#L345 is an error and it must be {} and not []?

And we may stop using ugly hasOwnProperty if timers would be created with Object.create(null)

And about delete - it is still slow so, probably, better would be to use = undefined. Especially, if we already are trying to protect ourselves from Object's prototype manipulations (usage of hasOwnProperty) then delete would not save us. But = undefined would

@benjamingr
Copy link
Member

@benjamingr do you mean that this https://github.com/sinonjs/lolex/blob/master/src/lolex-src.js#L345 is an error and it must be {} and not []?

Yeah, though given how JS works it doesn't really matter.

And we may stop using ugly hasOwnProperty if timers would be created with Object.create(null)

Also true.

And about delete - it is still slow so, probably, better would be to use = undefined.

I think there is some confusion there, "delete being slow" is a thing because of hidden classes - but for what we're doing the hashmap version generated by the delete is actually probably preferable.

Using a heap would make more sense anyway.

@akhilkedia
Copy link
Author

@benjamingr I'm kinda occupied with other projects at the moment - I needed only a small subset of lolex's features, so I made a fork of my own.
You can find it here -https://gist.github.com/akhilkedia/9c59139d9bf04bb7695085a55acde429

The critical parts are using TINY_QUEUE (which is the heap), using clock.timers.push(timer); in addTimer, timers.pop() in firstTimer.

This implementation assumes a code-flow that is valid for this subset of Lolex features, and will probably need to be amended for the full Lolex. (Depending on the APIs required from the heap, you might also consider some other heap.)

@Alexsey
Copy link
Contributor

Alexsey commented Apr 19, 2018

@benjamingr Yes, you are right, the way it's used delete is ok. I can make a PR to fix other stuff

@fatso83
Copy link
Contributor

fatso83 commented May 18, 2018

@akhilkedia do you have any code we can use to benchmark lolex? This seems like a fun task, but as it complicates the implementation we need to make sure it's worth it in terms of an actual performance improvement.

I know that the Chai projects are really focusing on performance budgets, constantly monitoring performance, so might be worth having a look at that project. This is worth an issue of its own, probably.

@benjamingr
Copy link
Member

@fatso83 I actually have a branch that uses the priorityqueue package and mostly passes the test, my current reservations about using it:

  • lolex was never slow for me in a real life use case.
  • we don't have a benchmark suite (like you said)
  • changing the implementation to a priorityqueue required substantial changes (and removing and then pushing things back to the queue because of looking for timers).
  • I use the Node.js benchmark tool (just because I'm familiar with it) to test the Peer5 test suite with both implementations and I did not notice a significant change.

I definitely think a benchmark is the right way to go about this at this point to prove the problem. The potential for breakage in code paths we don't have tested and the lack of motivating benchmarks makes this a hard sell at the moment IMO.

@fatso83
Copy link
Contributor

fatso83 commented May 18, 2018

Agreed. No changing of fundamentals until proven we are missing a huge performance opportunity (verified using a benchmark).

@akhilkedia
Copy link
Author

@fatso83 I can provide you with a sample code which roughly corresponds to my use case, which will show the majority of the time being spent in Lolex's firstTimer(), as well as the issue being fixed after shifting to a priority queue.

Whether this issue also occurs while testing Peer5 test suite (or for some other benchmark), I cannot comment. In general, the larger the number of pending timeouts (and the smaller the processing inside each individual timeout callback), the worse this issue becomes.

@fatso83
Copy link
Contributor

fatso83 commented May 18, 2018

Sample code would be great. Just comment with a gist or repo link (or email - in my profile).

@akhilkedia
Copy link
Author

Hi!
I have uploaded a sample code here -
https://github.com/akhilkedia/lolex-queue

Vanilla lolex takes 20 seconds for this sample on my PC, and a sample modified queue-based lolex uses 0.5.

@benjamingr
Copy link
Member

benjamingr commented Jun 24, 2018

@akhilkedia thank you for doing this! It is very useful.

I'm not sure the actual benchmark in https://github.com/akhilkedia/lolex-queue/blob/master/index.js provides a use case for a typical lolex app.

That said, it is much smaller than what I had in my branch (which actually transformed clock.timers from an object to a queue) - as it just keeps an additional queue to all timers.

I'm not sure why it would be that faster since firstTimerInRange still uses a for.. in on all timers and iterates all timers rather than use the queue.

This works for clock.next used by runAll but not for clock.tick which uses firstTimerInRange. I would expect firstTimerInRange to also work with the queue.

If you are willing to pursue this then I'm definitely willing to put in the work to review it and help it get merged back into lolex if we can guarantee there is no breakage in existing code.

@akhilkedia
Copy link
Author

I would suggest simply changing the entire clock.timers to a queue just to keep the code simpler.

Whenever firstTimerInRange is used, is it possible to simply replace it with firstTimer and then perform a check to see if the timer is within the range? (Based on what I can tell from the code flow, this should work right?)

If yes, replacing clock.timers with a queue is gonna be a breeze.

@benjamingr
Copy link
Member

Whenever firstTimerInRange is used, is it possible to simply replace it with firstTimer and then perform a check to see if the timer is within the range? (Based on what I can tell from the code flow, this should work right?)

Well, what I did in my fork was shift items from the queue until one is in the range from the queue and then push them back to the queue when I was done with it.

Typically - it doesn't shift anything at all because the range always checks the closest timers. I outlined some of the challenges here: #159 (comment)

The unfortunate part is that changing the underlying data structure can break millions of monthly downloads if not done vey carefully - that doesn't mean we can't do it but it means we have to be all that more careful.

@stale
Copy link

stale bot commented Aug 23, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 23, 2018
@stale stale bot closed this as completed Aug 30, 2018
@arthurwolf
Copy link

arthurwolf commented Nov 26, 2021

Anyone make any progress on this? Lolex is spending way too much time iterating over arrays, and I would love some kind of solution...

Screenshot from 2021-11-26 13-37-21

Note: Using https://github.com/akhilkedia/lolex-queue/blob/master/index.js worked PERFECTLY and exploded my performance:

Screenshot from 2021-11-26 13-50-55

@fatso83
Copy link
Contributor

fatso83 commented Nov 28, 2021

@arthurwolf Not anyone we know of 😉 As you see from this discussion we are more than willing to accept PRs, and the discussion here already outlines the general solution. We just need to test it thoroughly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants