
perf regression in 2.7.9 #1009

Closed
klensy opened this issue Apr 13, 2024 · 6 comments · Fixed by #1013

@klensy

klensy commented Apr 13, 2024

Describe the bug
Huge perf regression in 2.7.9, probably because of #965, visible in http parser and json parser benchmarks

To Reproduce
checkout 2.7.8 version of pest, cd grammars, cargo bench
checkout 2.7.9 version of pest, cd grammars, cargo bench
see the diff

Expected behavior
Not big regressions like here:

http parser             time:   [3.5245 ms 3.5328 ms 3.5464 ms]
                        change: [+1003.6% +1008.2% +1014.4%] (p = 0.00 < 0.05)
                        Performance has regressed.

json parser             time:   [215.12 µs 215.53 µs 216.20 µs]
                        change: [+919.13% +928.78% +940.25%] (p = 0.00 < 0.05)
                        Performance has regressed.

Additional context
One of probable slowdown spots?

pest/pest/src/parser_state.rs

Lines 1019 to 1021 in f60b518

let token = ParsingToken::Sensitive {
    token: String::from(string),
};
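For context on why this line is hot: every tracked token copies the matched literal into an owned String, and in generated parsers the literals passed in are `&'static str`, so the copy is avoidable in principle. A minimal sketch of a borrow-based variant using `Cow` (hypothetical names; not pest's actual internals):

```rust
use std::borrow::Cow;

// Hypothetical variant of the tracking token that borrows the grammar's
// string literals instead of copying them into a fresh String.
#[derive(Debug, Clone, PartialEq)]
enum ParsingToken {
    Sensitive { token: Cow<'static, str> },
}

fn track(string: &'static str) -> ParsingToken {
    // Borrowed: no heap allocation on the hot parsing path.
    ParsingToken::Sensitive {
        token: Cow::Borrowed(string),
    }
}

fn main() {
    let t = track("GET");
    assert_eq!(
        t,
        ParsingToken::Sensitive {
            token: Cow::Borrowed("GET")
        }
    );
    println!("{t:?}");
}
```

`Cow<'static, str>` keeps the door open for owned tokens where a borrow isn't possible, while the common literal case stays allocation-free.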

@klensy klensy added the bug label Apr 13, 2024
@tomtau
Contributor

tomtau commented Apr 13, 2024

@EmirVildanov do you think that can be optimized? Maybe the allocation can be removed if the token is &str instead of String?

@tomtau
Copy link
Contributor

tomtau commented Apr 13, 2024

Alternatively we can perhaps have this better error reporting as a compile time or runtime option

@chrivers

I also noticed this slowdown. It's very noticeable in real-world situations.

In a private development branch of rustray, I noticed this in the SBT2 parser (for ray tracer scene files).

With pest version 2.7.9:

[*]   parse .....:  2139.15 ms

Downgrading to 2.7.8 yields:

[*]   parse .....:   297.14 ms

So yeah, about 750% slower :-/

@tomtau
Contributor

tomtau commented Apr 15, 2024

I'm guessing the main culprit may be try_add_new_stack_rule, which is called often during parsing; the maintenance of that call stack may be heavy: https://github.com/pest-parser/pest/pull/965/files#diff-4bc69b84e1294b9c791fd94baf98de34748c484129ac9a6c2757babefcf862e0R274

@EmirVildanov
Contributor

@EmirVildanov do you think that can be optimized? Maybe the allocation can be removed if the token is &str instead of String?

Yeah, I'll profile the code and try to optimize it!
Thanks for pointing at the possible culprit and sorry for the introduced regression 🙏 .

@tomtau
Contributor

tomtau commented Apr 15, 2024

Yeah, I'll profile the code and try to optimize it!
Thanks for pointing at the possible culprit and sorry for the introduced regression 🙏 .

I tried to reproduce it and indeed, there's a regression. In my setup (M1 CPU), it was approx. 50% faster than what @klensy posted:

http parser             time:   [2.0941 ms 2.0948 ms 2.0960 ms]
                        change: [+500.10% +520.67% +538.10%] (p = 0.00 < 0.05)
                        Performance has regressed.

json parser             time:   [124.61 µs 124.65 µs 124.71 µs]
                        change: [+492.54% +492.97% +493.40%] (p = 0.00 < 0.05)
                        Performance has regressed.

but it's still a regression / around 5x slower than the previous version baseline.

Without profiling, a few optimization ideas besides reducing the allocations:

  • not clearing those tracking structures where possible (and instead keeping track of where last position tokens were added)
  • different data structures for lookups: e.g. instead of call_stack in self.call_stacks.iter().skip(start_index) to find the deepest rule etc., having a few refs or a HashMap (or BTreeMap) to look it up
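The second idea could look roughly like the following sketch: instead of scanning call_stacks linearly to find the deepest rule at the furthest position, keep a BTreeMap keyed by input position so the lookup becomes a range query. All names and types here are hypothetical, not pest's actual internals:

```rust
use std::collections::BTreeMap;

// Hypothetical index: for each input position, remember the deepest rule
// (by call-stack depth) whose attempt reached that position.
#[derive(Default)]
struct DeepestRuleIndex {
    // position -> (call-stack depth, rule name)
    by_pos: BTreeMap<usize, (usize, &'static str)>,
}

impl DeepestRuleIndex {
    fn record(&mut self, pos: usize, depth: usize, rule: &'static str) {
        let entry = self.by_pos.entry(pos).or_insert((depth, rule));
        if depth > entry.0 {
            // Keep only the deepest attempt seen at this position.
            *entry = (depth, rule);
        }
    }

    // The rule to report: the deepest attempt at the furthest position,
    // found via the map's last key instead of a linear scan.
    fn deepest_at_furthest(&self) -> Option<(usize, &'static str)> {
        self.by_pos
            .iter()
            .next_back()
            .map(|(pos, &(_, rule))| (*pos, rule))
    }
}

fn main() {
    let mut idx = DeepestRuleIndex::default();
    idx.record(0, 1, "json");
    idx.record(5, 3, "string");
    idx.record(5, 2, "value");
    assert_eq!(idx.deepest_at_furthest(), Some((5, "string")));
}
```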

But if optimizations can't bring the overhead down close to the previous baseline, I guess the best course of action may be to introduce a runtime AtomicBool flag that one can set to opt into better error reporting. A runtime option seems preferable to a compile-time one, as one may want both behaviours: false for default parsing with performance similar to the previous version, and true to explicitly get a better error report.
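The opt-in flag could be sketched like this (hypothetical names and placement; the eventual fix in pest may differ), with a relaxed atomic load guarding the tracking work so the disabled path stays cheap:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical global switch: error-detail tracking is skipped unless the
// user opts in, so default parsing keeps the pre-regression profile.
static TRACK_ERROR_DETAILS: AtomicBool = AtomicBool::new(false);

pub fn set_error_detail(enabled: bool) {
    TRACK_ERROR_DETAILS.store(enabled, Ordering::Relaxed);
}

fn track_token(token: &str, tracked: &mut Vec<String>) {
    // A relaxed load is cheap on the hot path; the allocation below only
    // happens when the caller asked for richer error reports.
    if TRACK_ERROR_DETAILS.load(Ordering::Relaxed) {
        tracked.push(token.to_string());
    }
}

fn main() {
    let mut tracked = Vec::new();
    track_token("GET", &mut tracked); // disabled: nothing recorded
    assert!(tracked.is_empty());

    set_error_detail(true);
    track_token("GET", &mut tracked); // enabled: token is tracked
    assert_eq!(tracked, vec!["GET".to_string()]);
}
```

The commit that closed this issue ("make tracking for better error details optional") took this general direction, so the sketch illustrates the shape of the fix rather than its exact API.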

@tomtau tomtau linked a pull request Apr 30, 2024 that will close this issue
tomtau added a commit that referenced this issue May 1, 2024
* make tracking for better error details optional (fixes #1009)

* Update pest/src/parser_state.rs

* bump toolchain for msrv

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>