Improve performance of location data compensation #5294

helixbass
Collaborator

Fixes #5290

As discussed in that issue, performance for files with lots of "location data compensations" (e.g. stripped carriage returns, which affects all files on Windows) was blowing up because the algorithm used to calculate the location data compensation was very inefficient.

This PR replaces that algorithm with one that shouldn't have significant performance implications.

To test, I made a copy of a decent-sized source file (e.g. src/lexer.coffee), converted it to Windows-style line endings (\r\n), and ran bin/coffee -b -p [converted-filename.coffee].

Before this fix, compilation took many seconds; with this fix, it took less than a second.
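For anyone who wants to reproduce the test file without a Windows machine, the conversion step could be scripted roughly as follows. This is only a sketch: the helper names `to_crlf` and `write_crlf_copy` and the output path are hypothetical and not part of this PR.

```python
from pathlib import Path


def to_crlf(data: bytes) -> bytes:
    """Return data with every line ending converted to \\r\\n."""
    # Normalize existing \r\n to \n first so they are not doubled to \r\r\n.
    normalized = data.replace(b"\r\n", b"\n")
    return normalized.replace(b"\n", b"\r\n")


def write_crlf_copy(src: str, dst: str) -> None:
    """Write a copy of src to dst using Windows-style line endings."""
    Path(dst).write_bytes(to_crlf(Path(src).read_bytes()))


if __name__ == "__main__":
    # Hypothetical output name; any path works.
    write_crlf_copy("src/lexer.coffee", "lexer-crlf.coffee")
```

The resulting copy can then be passed to bin/coffee -b -p as described above.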

I haven't specifically compared compiling a file with Windows line endings with the location data compensation calculation omitted entirely against compiling it with this new algorithm; that might be interesting as a baseline.

end += length
compensation
current = start
while current <= end
Collaborator Author

This just changes how the location data compensations for a given start/end range are calculated: the loop now iterates over start..end instead of iterating over the entire @locationDataCompensations data structure every time.
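To illustrate the difference, here is a simplified sketch in Python rather than the actual CoffeeScript from the lexer. It assumes the compensations are stored as a mapping from character offset to the number of stripped characters at that offset; the function names and the sample data are hypothetical.

```python
def compensation_full_scan(compensations, start, end):
    """Naive lookup: walks the whole compensation table on every call."""
    total = 0
    for offset in sorted(compensations):
        if offset < start:
            continue
        if offset > end:
            break
        length = compensations[offset]
        total += length
        end += length  # stripped characters push the end of the range out
    return total


def compensation_range_scan(compensations, start, end):
    """Range lookup: only visits the offsets between start and end."""
    total = 0
    current = start
    while current <= end:
        length = compensations.get(current)
        if length is not None:
            total += length
            end += length  # stripped characters push the end of the range out
        current += 1
    return total


# Hypothetical table: one stripped character at each of these offsets.
sample = {5: 1, 12: 1, 20: 1}
for start, end in [(0, 10), (4, 11), (13, 30), (21, 40)]:
    assert (compensation_full_scan(sample, start, end)
            == compensation_range_scan(sample, start, end))
```

Both functions return the same totals, but the cost of the range version depends on the length of the range rather than on how many compensations the whole file has recorded, which matters when every line of a file contributes one.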

Collaborator

GeoffreyBooth commented Jan 27, 2020

So you don't need Windows to test this, just a file with Windows line endings. To make things a bit more painful, I created a copy of nodes.coffee with Windows endings:

time cat src/nodes.coffee | coffee --stdio --compile > /dev/null
cat src/nodes.coffee  0.00s user 0.00s system 4% cpu 0.081 total
coffee --stdio --compile > /dev/null  2.21s user 0.17s system 158% cpu 1.500 total

 ✦ perl -p -e 's/\n/\r\n/' < src/nodes.coffee > test.coffee # https://superuser.com/a/71509/138751
time cat test.coffee | coffee --stdio --compile > /dev/null
cat test.coffee  0.00s user 0.00s system 4% cpu 0.083 total
coffee --stdio --compile > /dev/null  248.14s user 1.87s system 102% cpu 4:04.77 total

 ✦ git checkout --track helixbass/iss5290-slow-location-data-compensation
Branch 'iss5290-slow-location-data-compensation' set up to track remote branch 'iss5290-slow-location-data-compensation' from 'helixbass'.
Switched to a new branch 'iss5290-slow-location-data-compensation'
time cat test.coffee | ./bin/coffee --stdio --compile > /dev/null
cat test.coffee  0.00s user 0.00s system 2% cpu 0.127 total
./bin/coffee --stdio --compile > /dev/null  2.21s user 0.17s system 149% cpu 1.585 total

So in this test, compiling a version of nodes.coffee with Windows line endings dropped from about four minutes in 2.5.0 to about 1.6 seconds using this branch, essentially the same as the 1.5 seconds it took to compile the Unix line endings version. Seems like a success to me.

@GeoffreyBooth GeoffreyBooth merged commit 6fe980e into jashkenas:master Jan 27, 2020
@helixbass helixbass deleted the iss5290-slow-location-data-compensation branch January 27, 2020 17:30
@GeoffreyBooth GeoffreyBooth mentioned this pull request Jan 30, 2020
Successfully merging this pull request may close these issues.

Bug: 2.5.0: Compilation of files with Windows line endings very slow