Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent memory leaks by not holding onto AST nodes #1073

Closed
ondrejmirtes opened this issue Apr 8, 2022 · 21 comments · Fixed by #1229
Closed

Prevent memory leaks by not holding onto AST nodes #1073

ondrejmirtes opened this issue Apr 8, 2022 · 21 comments · Fixed by #1229

Comments

@ondrejmirtes
Copy link
Contributor

Hi, I'm in the process of reducing the memory amount consumed by PHPStan, and I have some promising results.

Current stable version consumes ~850 MB when runing PHPStan on PHPStan.

After I removed the NodeConnectingVisitor (this isn't related to BetterReflection) that creates cycles between nodes by setting parent/previous/next attributes, I saved around 100 MB. So now we're at ~750 MB.

I continued to debug the problem by using php-meminfo but there was a lot of noise because of CachedParser (equivalent of MemoizingParser). So I removed parser caching to see what else holds onto AST nodes. I also removed static reflection capabilities altogether to see if there's something else in PHPStan that leaks. I consider the floor of what's possible to achieve at around ~200 MB.

When I returned static reflection capabilities, it went back to ~750 MB. And most of the memory is held in AST nodes held by various reflection classes like ReflectionClass etc.

So I refactored some classes to extract all of the needed information in constructor so that the AST node is not saved in a property: ondrejmirtes/BetterReflection@cc581ec...35db84f (the work isn't done, there's still more to do)

Right now PHPStan consumes around 340 MB without a CachingParser and around 430 MB with a CachingParser.

The best part is that there's very little performance impact time-wise so it looks really promising. I don't have production-ready code yet as there are some problems to work out but I hope this will make PHPStan much more resource-effective.

One downside is that I had to remove some features that need the AST after a Reflection class is created - it can be seen in the diff.

Is this something you'd like to do here too?

@Ocramius
Copy link
Member

Ocramius commented Apr 8, 2022

AFAIK, BR nodes are acyclic: the CachedParser was introduced on purpose, to prevent double-parsing of data (which happens a lot when jumping around a codebase, since we don't cache reflected symbols either).

IMO, a better approach would be to make the codebase as @immutable as possible:

  • ReflectionClass
  • ReflectionProperty
  • ReflectionType
  • ...

Besides some rare edge cases in their API, we can likely make these completely immutable, by deferring creation of stuff that requires I/O to the reflector itself.

This may potentially change the reflection API radically though 🤔

I suppose the code removals you mentioned are mostly around getAst()?

@ondrejmirtes
Copy link
Contributor Author

BR nodes are acyclic

It might be. What I want to solve is when MemoizingReflector is involved, Reflection instances aren't freed from memory on purpose, but alongside they're pulling the AST nodes which also cannot be freed. So we have caching of Reflection instances which is a good thing, but the memory occupied is more than it could be - we could hold the same amount of Reflection information and free some memory.

What's also interesting that with my refactoring, there's also no difference in performance and occupied memory with gc_disable() called or not - everything is freed with refcounting and garbage collector doesn't have anything to do, which is what I meant with my tweet a few days ago (https://twitter.com/OndrejMirtes/status/1510904762711089157).

What I also do in PHPStan successfully and need to try to do it on various Memoizing* things in BR is to cap the cache size at certain limit. Same as with server-side development, it's not wise to treat cache as unlimited resource, so I evict the oldest items from the cache with this array_slice trick in my CachedParser: https://github.com/phpstan/phpstan-src/blob/6d25967ca2b174895bb8e40f7bcffad874d88d6a/src/Parser/CachedParser.php#L33-L42

I suppose the code removals you mentioned are mostly around getAst()?

Yeah. @kukulich noted we don't have to lose those features - we still have LocatedSource and the AST can be recreated if necessary.

@Ocramius
Copy link
Member

Ocramius commented Apr 9, 2022

Overall, I'm on board with changing library performance, but not as hacks/shuffling, but rather by sitting back and checking architecture with more care.

I'll put a new major as milestone, as I feel like such a feature/improvement is also leaving us space for major BC breaks.

@ondrejmirtes
Copy link
Contributor Author

Alright, so what do you think about removing the methods that won't have the data anymore, like getAst() and getDeclaringNamespaceAst()? I don't think I removed anything else, checking my diff at ondrejmirtes/BetterReflection@cc581ec...e594f06.

Also - it's still useful to keep the value expr of constants and default expr of parameters and properties for later access by library users. Also it's useful because it's beneficial to have CompileNodeToValue evaluated lazily, otherwise you run into infinite recursion problems in constructors.

Do you agree? @kukulich is interested in starting doing this :)

@Ocramius
Copy link
Member

Ocramius commented Jun 2, 2022

They would most likely be removed in an immutable version of the library.

We can also decide to drop them in a new major (without going through the architectural change).

@Ocramius
Copy link
Member

Ocramius commented Jun 2, 2022

Also: OK with dropping problematic functionality 👍

@kukulich
Copy link
Collaborator

Ok, it looks I will prepare PR with removing getAst() (and dependant) methods first.

@Ocramius
Copy link
Member

Only ReflectionFunction* left, AFAIK!

@kukulich
Copy link
Collaborator

Unfortunatelly also ReflectionClass, ReflectionMethod, ReflectionEnum and ReflectionObject :)

@Ocramius
Copy link
Member

Ah, we reduced our MT score while working on this:

Error: ] The minimum required MSI percentage should be 97%, but actual is       
         96.87%. Improve your tests!     

Oh well 🤷

@kukulich
Copy link
Collaborator

Ah, we reduced our MT score while working on this:

Will look at it later.

@ondrejmirtes
Copy link
Contributor Author

What I'm missing in the current PRs are getters for the underlying expression values. These were previously accessible via the removed getAst() methods which were removed at my request, but the reflection classes should still give access to things like:

  • Constant value expression
  • Class constant value expression
  • Property default expression
  • Parameter default expression
  • Attributes arguments expressions
  • Enum case value expression

These are already saved in private properties because they're needed for CompileNodeToValue in the lazy getValue() etc. methods.

I need them because PHPStan completely avoids using CompileNodeToValue. Previously it used CompileNodeToValue to go "Expr -> runtime value -> Type" but it's no longer possible because these expressions can contain new and enum case references which means that obtaining runtime value relies on autoloading.

PHPStan now uses its own code (https://github.com/phpstan/phpstan-src/blob/b0babd082514303ec62deb5f16cf330f845c1821/src/Reflection/InitializerExprTypeResolver.php) to go "Expr -> Type" directly.

@Ocramius Are you fine with @kukulich adding these getters? Thank you.

@Ocramius
Copy link
Member

Yes, it's OK to add them, but beware that we still cannot support objects as default values, such as in function foo($bar = new Baz()) {} 🤔

@ondrejmirtes
Copy link
Contributor Author

This code is fine as long as the user doesn't call getValue() 😂

@Ocramius
Copy link
Member

@ondrejmirtes could you create a rough issue to track that? What API you want / how it should look like?

@Ocramius
Copy link
Member

BTW, getAst() needs to be removed from docs too

@ondrejmirtes
Copy link
Contributor Author

Done: #1222

@Ocramius
Copy link
Member

@Ocramius
Copy link
Member

@ondrejmirtes waiting for your feedback for a 6.0.0 release BTW :)

@ondrejmirtes
Copy link
Contributor Author

Awesome, please give me a few days, I need to rebase my fork and make PHPStan work with that, after that I'll make some memory measurements to make sure it's the same or better :)

@Ocramius
Copy link
Member

Creating a signpost issue for that 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants