Speedup object creation #14091
In preparation for moving them to the class, make the operator functions binary. Adjust the lambdas for trivial operators, and store unbound methods for non-trivial ones. Signed-off-by: Paolo Bonzini <[email protected]>
InterpreterObject, MesonInterpreterObject cannot be used directly, as they contain nothing that the user can use. Mark them as abstract. Likewise for MutableInterpreterObject, which is a mixin. Signed-off-by: Paolo Bonzini <[email protected]>
Do not call update() and Enum.__hash__ a gazillion times; trivial operators are the same for every instance of the class. Introduce the infrastructure to build the MRO-resolved operators (so the outcome is the same as if one called super().__init__) for each subclass of InterpreterObject. Signed-off-by: Paolo Bonzini <[email protected]>
Do not call update() and Enum.__hash__ a gazillion times; operators are the same for every instance of the class. In order to access the class for non-trivial operators, the operators are first marked using a decorator, and then OPERATORS is built via __init_subclass__. Signed-off-by: Paolo Bonzini <[email protected]>
Do not call update() and Enum.__hash__ a gazillion times; operators are the same for every instance of the class. In order to access the class, just mark the methods using a decorator and build METHODS later using __init_subclass__. Non-primitive objects are not converted yet to keep the patch small. They are created a lot less than other objects, especially strings and booleans. Signed-off-by: Paolo Bonzini <[email protected]>
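The decorator-plus-__init_subclass__ approach described in these commits can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Meson's actual code: the names Op, operator, method, Base, Str, method_call and the _op/_method_name marker attributes are hypothetical and chosen only for this example.

```python
import enum
import typing as T


class Op(enum.Enum):
    # Hypothetical stand-in for the interpreter's operator enum.
    PLUS = 1
    EQUALS = 2


def operator(op: Op) -> T.Callable:
    """Mark a function as the handler for `op` (hypothetical decorator)."""
    def mark(func):
        func._op = op
        return func
    return mark


def method(name: str) -> T.Callable:
    """Mark a function as the user-visible method `name` (hypothetical decorator)."""
    def mark(func):
        func._method_name = name
        return func
    return mark


class Base:
    # Tables live on the class, so they are built once per class
    # instead of once per instance.
    OPERATORS: T.Dict[Op, T.Callable] = {}
    METHODS: T.Dict[str, T.Callable] = {}

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        operators: T.Dict[Op, T.Callable] = {}
        methods: T.Dict[str, T.Callable] = {}
        # Walk the MRO from the most generic class to the most specific one,
        # so that entries from subclasses override those of their parents.
        for klass in reversed(cls.__mro__):
            for attr in vars(klass).values():
                if hasattr(attr, '_op'):
                    operators[attr._op] = attr
                if hasattr(attr, '_method_name'):
                    methods[attr._method_name] = attr
        cls.OPERATORS = operators
        cls.METHODS = methods

    def __init__(self, value):
        self.value = value

    @operator(Op.EQUALS)
    def op_equals(self, other):
        return self.value == other

    def operator_call(self, op, other):
        # Entries are plain (unbound) functions, so pass self explicitly.
        return self.OPERATORS[op](self, other)

    def method_call(self, name, *args):
        return self.METHODS[name](self, *args)


class Str(Base):
    @operator(Op.PLUS)
    def op_plus(self, other):
        return self.value + other

    @method('to_upper')
    def to_upper_method(self):
        return self.value.upper()
```

In this sketch, Str inherits the EQUALS handler from Base and adds its own PLUS handler, and every Str instance shares the same OPERATORS and METHODS dictionaries, so object creation no longer touches them.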
More timings from QEMU's meson setup:
The main remaining hotspots for QEMU are still flatten_object_list/_determine_ext_objs, especially object_filename_from_source and canonicalize_filename, and generate_single_compile. determine_ext_objs is roughly 10% and canonicalize_filename is about half of it. However, QEMU is a bit special and probably uses these more than anyone else. The remaining low-hanging fruit:
Some harder possibilities:
Focusing on such small percentages may seem weird, but it depends on how many of them you pile up. After all, today's 1% was yesterday's 0.5%: as more optimizations are performed, one has to focus on the smaller ones. And the places that are not interesting IMO:
When building QEMU, about 10% of the time is spent in the various __init__ functions for InterpreterObject subclasses, especially primitives:

(Of the calls to Enum.__hash__, 1530823 come from the same __init__ functions; most of the others come from InterpreterObject.operator_call.)

This is because the method and operator dictionaries are rebuilt from scratch for every object. Each string creation is about 100 microseconds, but strings as well as other objects add up quickly due to _holderify calls.

Move operators and methods to a class attribute instead. In the case of methods, I am only doing so for primitives to keep the pull request smaller, but it is possible (and saves a handful of lines of code) to do this for all objects.
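To make the cost difference concrete, here is a small stand-alone comparison of the two patterns. It is only a sketch: the classes PerInstance and PerClass and the Op enum are hypothetical and much simpler than Meson's real objects, so the absolute timings will not match the numbers quoted above.

```python
import enum
import timeit


class Op(enum.Enum):
    # Hypothetical stand-in for the interpreter's operator enum.
    PLUS = 1
    EQUALS = 2
    NOT_EQUALS = 3


class PerInstance:
    """Old pattern: every object rebuilds its operator table."""

    def __init__(self, value):
        self.value = value
        self.operators = {}
        # Building the table hashes every Enum key again (Enum.__hash__),
        # and this happens for each object that is created.
        self.operators.update({
            Op.PLUS: lambda other: self.value + other,
            Op.EQUALS: lambda other: self.value == other,
            Op.NOT_EQUALS: lambda other: self.value != other,
        })


class PerClass:
    """New pattern: one table per class, holding binary functions."""

    OPERATORS = {
        Op.PLUS: lambda obj, other: obj.value + other,
        Op.EQUALS: lambda obj, other: obj.value == other,
        Op.NOT_EQUALS: lambda obj, other: obj.value != other,
    }

    def __init__(self, value):
        self.value = value


if __name__ == '__main__':
    n = 100_000
    print('per instance:', timeit.timeit(lambda: PerInstance('a'), number=n))
    print('per class:   ', timeit.timeit(lambda: PerClass('a'), number=n))
```

Because the class-level functions take the object as an explicit first argument, they no longer need to close over self, which is what allows a single table to serve every instance.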