-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Possible memory leak in ApolloServerPluginUsageReporting #6983
Comments
I do agree that reportDataByExecutableSchemaId is a bit of a memory leak. The state of the relationship between Gateway and Server doesn't make it completely trivial to determine when we're done with a particular iteration of the schema (since in-flight operations can continue to execute for a while after the new schema shows up). On the other hand, it seems like it shouldn't be a particularly large memory leak: since the Reports in the ReportData are reset every time reports are sent then I'd expect the leak to only be a relatively small data structure (a ReportHeader and an empty OurReport) for each time your schema is changed. So I'd be happy to try to improve the plugin to remove things from the data structure instead of leaving them around indefinitely, but I am a bit curious to hear what the scale of the leak you're seeing is, in case this isn't the actual problem. Have you used heap profiling tools? |
With the default In terms of isolating this leak, I have spent the last weeks/months stripping everything non-essential out of this service to the point that it now only contains the gateway server, usage reporting plugin, and some stateless jwt validation code. These are extremely locked down production servers so it is not going to be possible to generate heap dumps for profiling. It is also not really practical for us to replicate our entire federated graph locally and generate requests with enough variance and throughput to easily reproduce this behaviour. |
While I'm skeptical that this particular data structure (vs something else
in usage reporting) would create such a noticeable leak, the best way to
evaluate that would be to fix it :)
What version of Apollo Server are you currently using?
…On Wed, Oct 5, 2022, 1:56 AM Will Albone ***@***.***> wrote:
With the default fieldLevelInstrumentation: true we were seeing memory
usage increasing by 40-90mb per day. Having dropped that down to fieldLevelInstrumentation:
0.1 the rate of increase has dropped to around 7-25mb per day.
In terms of isolating this leak, I have spent the last weeks/months
stripping everything non-essential out of this service to the point that it
now only contains the gateway server, usage reporting plugin, and some
stateless jwt validation code. These are extremely locked down production
servers so it is not going to be possible to generate heap dumps for
profiling. It is also not really practical for us to replicate our entire
federated graph locally and generate requests with enough variance and
throughput to easily reproduce this behaviour.
—
Reply to this email directly, view it on GitHub
<#6983 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAECVBHXJJJGRPQDKLIK4DWBU7ELANCNFSM6AAAAAAQ3SAVYQ>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
3.10.2 |
A data structure in the usage reporting plugin would get a new entry each time the server's schema changed. Old entries would never be deleted, though their values would be shrunk to a minimum size. It seems that this was not good enough and is producing notable memory leaks for some folks. So let's fix it! The reason for the old behavior was to be really careful to never write to a report that may have already been taken out of the map and sent. Previously this was accomplished by having the main entry in the map never change. Now this is accomplished by making sure that we never write to the report any later than immediately (and synchronously) after we pull it out of the map. Fixes #6983.
A data structure in the usage reporting plugin would get a new entry each time the server's schema changed. Old entries would never be deleted, though their values would be shrunk to a minimum size. It seems that this was not good enough and is producing notable memory leaks for some folks. So let's fix it! The reason for the old behavior was to be really careful to never write to a report that may have already been taken out of the map and sent. Previously this was accomplished by having the main entry in the map never change. Now this is accomplished by making sure that we never write to the report any later than immediately (and synchronously) after we pull it out of the map. Fixes #6983.
A data structure in the usage reporting plugin would get a new entry each time the server's schema changed. Old entries would never be deleted, though their values would be shrunk to a minimum size. It seems that this was not good enough and is producing notable memory leaks for some folks. So let's fix it! The reason for the old behavior was to be really careful to never write to a report that may have already been taken out of the map and sent. Previously this was accomplished by having the main entry in the map never change. Now this is accomplished by making sure that we never write to the report any later than immediately (and synchronously) after we pull it out of the map. Fixes #6983.
A data structure in the usage reporting plugin would get a new entry each time the server's schema changed. Old entries would never be deleted, though their values would be shrunk to a minimum size. It seems that this was not good enough and is producing notable memory leaks for some folks. So let's fix it! The reason for the old behavior was to be really careful to never write to a report that may have already been taken out of the map and sent. Previously this was accomplished by having the main entry in the map never change. Now this is accomplished by making sure that we never write to the report any later than immediately (and synchronously) after we pull it out of the map. Fixes #6983. Backport from #6998.
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to version-4, this PR will be updated.⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ `version-4` is currently in **pre mode** so this branch has prereleases rather than normal releases. If you want to exit prereleases, run `changeset pre exit` on `version-4`.⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ # Releases ## @apollo/[email protected] ### Patch Changes - Updated dependencies \[[`233b44eea`](233b44e)]: - @apollo/[email protected] ## @apollo/[email protected] ### Patch Changes - [#6998](#6998) [`233b44eea`](233b44e) Thanks [@glasser](https://github.com/glasser)! - Fix a slow memory leak in the usage reporting plugin (#6983). Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
A data structure in the usage reporting plugin would get a new entry each time the server's schema changed. Old entries would never be deleted, though their values would be shrunk to a minimum size. It seems that this was not good enough and is producing notable memory leaks for some folks. So let's fix it! The reason for the old behavior was to be really careful to never write to a report that may have already been taken out of the map and sent. Previously this was accomplished by having the main entry in the map never change. Now this is accomplished by making sure that we never write to the report any later than immediately (and synchronously) after we pull it out of the map. Fixes #6983. Backport from #6998.
@willalbone Why don't you give v3.10.3 a shot and see if it helps. |
Package: apollo-server-core@^3.10.2
We are running a federated gateway with an instance of the ApolloServerPluginUsageReporting. Over time we are seeing heap memory of these servers steadily increasing. As of 30 Sept we changed the
fieldLevelInstrumentation
option passed to the plugin from the default 100% down to 10%. This has seen a decrease in the amount of memory being used by our federated gateway however it is continuing to increase over time.I believe there is an issue with the
getReportData
function used internally within the usage reporting pluginapollo-server/packages/apollo-server-core/src/plugin/usageReporting/plugin.ts
Line 196 in 0e42833
The function appears to be storing ReportData objects in a plain object
Record<string, ReportData>
but does not seem to ever remove entries from this object. I would expect these entries to be removed once the reports have been sent to Apollo Studio.Apologies for the lack of a repro case, I have not been able to pull something together that is able to replicate what we are seeing in our production environments
The text was updated successfully, but these errors were encountered: