PEP 649: Avoid creation of function objects for __annotate__ #124157

Open
JelleZijlstra opened this issue Sep 17, 2024 · 2 comments
Labels
3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-typing

Comments

@JelleZijlstra
Member

JelleZijlstra commented Sep 17, 2024

Currently, when a class, function, or module has any annotations, we always generate an __annotate__ function object at import time. A function object takes 168 bytes. But in most cases, all of the relevant fields on an __annotate__ function are predictable (there's no docstring, no defaults or kwdefaults, the name is __annotate__, etc.). So we could save significant memory by storing only a smaller object and building the function on demand when somebody asks for it (by accessing __annotate__).
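For anyone who wants to check the sizes on their own build, here is a minimal sketch using sys.getsizeof. The annotate_like function is a hand-written stand-in (an assumption, not compiler output), and the numbers vary by Python version and platform, so they may not match the figures quoted in this issue.

```python
import sys


def annotate_like(format):
    # Hand-written stand-in for a compiler-generated __annotate__ function.
    return {"x": int}


# Shallow sizes only: objects referenced by the function or the code object
# (names, constants, bytecode) are not included in these numbers.
function_size = sys.getsizeof(annotate_like)
code_size = sys.getsizeof(annotate_like.__code__)

print(f"function object: {function_size} bytes")
print(f"code object:     {code_size} bytes")
```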

We need the following to create an __annotate__ function object:

  • The code object itself. That's inescapable.
  • The globals dict. For function annotations, we can reuse the function's globals. For module annotations, we can use the module dict. But for classes, the __annotate__ descriptor can't easily get to the globals dict. To do this, we may need a new bytecode instruction that just loads the current globals.
  • The closure tuple. Module annotations never have this; classes always have it (a reference to the classdict); functions often have it (always for methods, never for global functions, often for nested functions).
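The three ingredients above can be inspected on any ordinary function. The sketch below uses hand-written stand-ins for annotate functions (the names module_level_annotate and make_nested_annotate are made up for illustration) and rebuilds one of them with types.FunctionType from just its code object, globals, and closure.

```python
import types


def module_level_annotate(format):
    # No free variables, so there is no closure tuple.
    return {"x": int}


def make_nested_annotate():
    hint = float  # free variable, captured in a cell

    def __annotate__(format):
        return {"y": hint}

    return __annotate__


nested_annotate = make_nested_annotate()

for func in (module_level_annotate, nested_annotate):
    print(func.__name__, func.__code__.co_freevars, func.__closure__)

# Rebuild the nested function from its three ingredients alone.
rebuilt = types.FunctionType(
    nested_annotate.__code__,
    nested_annotate.__globals__,
    closure=nested_annotate.__closure__,
)

# 1 is the VALUE format defined by PEP 649.
print(rebuilt(1))
```

The rebuilt function is a distinct object that shares the original's code, globals, and cells, which is what on-demand construction would rely on.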

I am thinking of a format where __annotate__ can be any of the following:

  • A function, like today
  • A bare code object
  • A tuple containing a code object at position 0, optionally a globals dict at position 1, plus any number of cell objects

__annotate__ getters would have to recognize the second and third cases and translate them into function objects on the fly. As a result, users accessing .__annotate__ would never see the tuple, though those who peek directly into a module or class's __dict__ might.
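A rough Python sketch of what such a getter might do, assuming the tuple layout described above. The helper name materialize_annotate and the owner_globals fallback are hypothetical; a real implementation would live in C inside the interpreter and could differ in detail.

```python
import types


def materialize_annotate(stored, owner_globals):
    """Turn any of the three proposed storage forms into a function object."""
    if isinstance(stored, types.FunctionType):
        return stored  # already a function, like today
    if isinstance(stored, types.CodeType):
        # Bare code object: no closure, globals come from the owner.
        return types.FunctionType(stored, owner_globals)
    if isinstance(stored, tuple):
        code, *rest = stored
        # A dict at position 1 is the optional globals; the rest are cells.
        if rest and isinstance(rest[0], dict):
            func_globals, cells = rest[0], rest[1:]
        else:
            func_globals, cells = owner_globals, rest
        return types.FunctionType(code, func_globals, closure=tuple(cells) or None)
    raise TypeError(f"unexpected __annotate__ storage: {type(stored).__name__}")


def _make_sample():
    target = int  # captured in a cell

    def __annotate__(format):
        return {"x": target}

    return __annotate__


def plain_annotate(format):
    return {"y": str}


sample = _make_sample()
compact = (sample.__code__, sample.__globals__, *sample.__closure__)

from_tuple = materialize_annotate(compact, owner_globals={})
from_code = materialize_annotate(plain_annotate.__code__, plain_annotate.__globals__)
print(from_tuple(1), from_code(1))
```

A getter built this way would presumably also cache the resulting function, so that repeated accesses return the same object.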

Other related opportunities for optimization:

  • Tools like functools.wraps would unnecessarily force materialization of the __annotate__ function. Not sure there's an elegant solution for this.
  • The function objects created for various PEP 695/696 objects (e.g., TypeVar bounds) work very similarly to annotate functions, and we could apply the same optimization to them.
  • A code object by itself is also pretty big (232 bytes), and many of its fields are not needed for an annotate function that may never get executed. We could internally create a more streamlined "mini-codeobject" and materialize the real code object only when necessary.


@JelleZijlstra JelleZijlstra added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-typing 3.14 new features, bugs and security fixes labels Sep 17, 2024
@JelleZijlstra JelleZijlstra self-assigned this Sep 17, 2024
@JelleZijlstra
Member Author

> We could internally create a more streamlined "mini-codeobject" and materialize the real code object only when necessary

Maybe a variation of this could be that we create a bytestring with the marshalled code object, and unmarshal it only on demand.
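As a sketch of that idea using the existing marshal module: the thaw helper and the annotate_like stand-in are hypothetical names, and this only demonstrates the round trip, not whether the bytes end up smaller than the code object in practice.

```python
import marshal
import types


def annotate_like(format):
    # Hand-written stand-in for a compiler-generated __annotate__ function.
    return {"x": int, "y": str}


# Keep only the marshalled bytes around instead of the live code object.
frozen = marshal.dumps(annotate_like.__code__)


def thaw(data, func_globals):
    """Unmarshal the code object and wrap it in a function on demand."""
    code = marshal.loads(data)
    return types.FunctionType(code, func_globals)


thawed = thaw(frozen, globals())
print(type(frozen).__name__, len(frozen), thawed(1))
```

marshal only serializes the code object, so the globals and any closure cells would still have to be stored alongside the bytestring.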

@JelleZijlstra
Member Author

> Tools like functools.wraps would unnecessarily force materialization of the __annotate__ function. Not sure there's an elegant solution for this.

The proposal in #124342 would actually fix this, because we make it so the wrapper accesses the wrapped function's .__annotate__ lazily.
