Optimize out toString from START/TEXT/END event IRule lookup #4
Memory profiling shows roughly 84MB allocated to String objects and their underlying char[] arrays when running the Benchmark end-to-end. CPU profiling shows 82% of CPU time spent calling toString on the location inside the doXXX methods.
ORIGINAL DESIGN

For simplicity's sake, the original design of SJXP used the Strings representing each rule's locationPath, as well as the parser's current state, as keys for hash lookups to see if there were any matches. This requires a toString call on the StringBuilder representing the current parser path 4x on every tag. As the profiling data shows, this led to an enormous amount of waste; even though SJXP was still performing fairly fast, the cost wasn't immediately obvious.

NEW DESIGN

To reduce the memory and CPU waste, the following design will be employed.
Lookups for matching IRules based on the parser's location will be done using Integer hash codes instead of needing to toString() the entire location every time.
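A minimal sketch of that lookup, assuming illustrative names (RuleLookup, addRule, match are not SJXP's actual API): rules are indexed by the hash code of their location path, and the parser computes the same hash directly over its StringBuilder, skipping the toString() call and its char[] copy.

```java
import java.util.HashMap;
import java.util.Map;

class RuleLookup {
    // Rules keyed by the hash code of their locationPath.
    private final Map<Integer, String> rulesByHash = new HashMap<>();

    void addRule(String locationPath) {
        rulesByHash.put(locationPath.hashCode(), locationPath);
    }

    // Compute the same value String.hashCode() would produce,
    // but straight from the CharSequence, with no String allocation.
    static int hashOf(CharSequence path) {
        int h = 0;
        for (int i = 0; i < path.length(); i++)
            h = 31 * h + path.charAt(i);
        return h;
    }

    String match(StringBuilder currentPath) {
        // Note: the map get still auto-boxes the int key, which is
        // exactly the cost the improved design below attacks.
        return rulesByHash.get(hashOf(currentPath));
    }
}
```

Because hashOf mirrors String.hashCode(), a rule registered from a String key matches the live StringBuilder path without either side ever materializing a new String per event.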
Implementing this change dropped memory usage 50% and CPU usage significantly; unfortunately, a ton of time is now spent inside the Integer.valueOf() method auto-boxing the int returned from hashCode.

IMPROVED DESIGN

Keeping the same design, but adding an Integer hash-code cache: a simple Integer[] of a decent size (it only needs to be as large as the number of unique paths in the document, probably never more than 64, so the default of 512 is more than adequate). Whenever an Integer for a hash code is requested, the cache is checked at the index of the hash code value modulo the cache length (absolute value). If null is found, an Integer representing that hash code is created, stored, and returned. We also make sure we don't get a collision of values, and either return the cached instance or create a new Integer. This way we never create more than hashCodeCache.length Integers regardless of how huge the document is.
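The cache described above can be sketched like this (class and method names are assumptions, not SJXP's actual code): one fixed array indexed by |hash| modulo its length, so at most cache.length Integer objects are ever created on the hot path.

```java
class IntegerCache {
    private final Integer[] cache;

    IntegerCache(int size) {      // the issue suggests 512 as the default
        this.cache = new Integer[size];
    }

    Integer valueOf(int hash) {
        // |hash % length| is always a valid index (< length).
        int idx = Math.abs(hash % cache.length);
        Integer cached = cache[idx];

        if (cached == null) {
            cached = hash;        // box once, then reuse on every lookup
            cache[idx] = cached;
            return cached;
        }

        // Collision check: a different hash mapped to the same slot.
        // Fall back to a fresh boxing rather than return the wrong value.
        return (cached.intValue() == hash) ? cached : Integer.valueOf(hash);
    }
}
```

For the steady-state case (a document's small, repeating set of paths) every lookup after the first returns the same cached Integer instance, which is what eliminates the per-event auto-boxing the profiler flagged.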
Results

Profiling the new design, I see the following: a HUGE decrease in memory usage/object creation. The top 3 memory consumers are now all directly tied to the underlying XML Pull Parser and no longer occupied by SJXP; the memory usage of SJXP has been reduced to an insignificant amount on top of the base parser. Benchmarking the "Benchmark" class with HPROF went from 85MB of memory created by SJXP down to around 1MB. CPU benchmark results show similar improvements. The overhead of SJXP is razor thin now, with one more optimization on its way to make it thinner :)
Benchmark results after the optimization: Processed 10845 bytes, parsed 60 XML elements in 8ms
Right now, every time there is a START, TEXT or END event from the underlying XMLPullParser, I do a toString on the internal location to see if any rules have locationPaths that match.
Using HPROF to profile the code, it looks like a large amount of memory and a disturbing amount of CPU time are spent creating and destroying all those temporary Strings.
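The wasteful pattern being described looks roughly like this sketch (WastefulLookup, addRule, and onStartTag are illustrative names, not SJXP's real API): a String-keyed map forces a fresh toString() on the path StringBuilder for every single event.

```java
import java.util.HashMap;
import java.util.Map;

class WastefulLookup {
    private final Map<String, Runnable> rulesByPath = new HashMap<>();
    private final StringBuilder location = new StringBuilder();

    void addRule(String locationPath, Runnable action) {
        rulesByPath.put(locationPath, action);
    }

    void onStartTag(String tag) {
        location.append('/').append(tag);
        // A brand-new String (and backing char[]) is allocated here on
        // every START event, only to be discarded after one map lookup.
        Runnable rule = rulesByPath.get(location.toString());
        if (rule != null) rule.run();
    }
}
```

With a toString() on START, TEXT and END events alike, a document with thousands of tags produces thousands of short-lived Strings that exist only long enough to key a single hash lookup.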