Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize apollo-server graceful shutdown to all integrations #5635

Merged
merged 1 commit into from
Aug 23, 2021

Conversation

glasser
Copy link
Member

@glasser glasser commented Aug 20, 2021

Previously, the batteries-included apollo-server package had a special
override of stop() which drains the HTTP server before letting the
actual Apollo Server stop() machinery begin. This meant that
apollo-server followed this nice shutdown lifecycle:

  • Stop listening for new connections
  • Close all idle connections and start closing connections as they go
    idle
  • Wait a grace period for all connections to close and force-close any
    remaining ones
  • Transition ApolloServer to the stopping state, where no operations
    will run
  • Run stop hooks (eg send final usage report)

This was great... but only apollo-server worked this way, because only
apollo-server has full knowledge and control over its HTTP server.

This PR adds a server draining step to the ApolloServer lifecycle and
plugin interface, and provides a built-in plugin which drains a Node
http.Server using the logic of the first three steps above.
apollo-server's behavior is now just to automatically install the
plugin.

Specifically:

  • Add a new 'phase' called draining that fits between started and
    stopping. Like started, operations can still execute during
    draining. Like stopping, any concurrent call to stop() will just
    block until the first stop() call finishes rather than starting a
    second shutdown process.
  • Add a new drainServer plugin hook (on the object returned by
    serverWillStart). Invoke all drainServer hooks in parallel during
    the draining phase.
  • Make calling stop() when start() has not yet completed
    successfully into an error. That behavior was previously undefined.
    Note that as of apollo-server-core: register signal handlers later and not on serverless #5639, the automatic stop() call from signal
    handlers can't happen before start() succeeds.
  • Add ApolloServerPluginDrainHttpServer to apollo-server-core.
    This plugin implements drainServer using the Stopper class
    that was previously in the apollo-server package. The default
    grace period is 10 seconds.
  • Clean up integration tests to just use stop() with the plugin
    instead of separately stopping the HTTP server. Note that for Fastify
    specifically we also call app.close although there is some weirdness
    here around both app.close and our Stopper closing the same server.
    A comment describes the weirdness; perhaps Fastify experts can improve
    this later.
  • The Hapi web framework has built in logic that is similar to our
    Stopper, so apollo-server-hapi exports
    ApolloServerPluginStopHapiServer which should be used instead of the
    other plugin with Hapi.
  • Remove some examples from READMEs and point to examples in the docs
    instead. Keeping both up to date is extra work.
  • Fix some test issues (eg, have FakeTimers only mock out Date.now
    instead of setImmediate, drop an erroneous const which made an app
    not get cleaned up, etc).

Fixes #5074.

@glasser glasser requested a review from IvanGoncharov August 20, 2021 04:21
Copy link
Member

@IvanGoncharov IvanGoncharov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glasser Code level changes look good and the idea to drain the server as a separate phase also makes a lot of sense.

glasser added a commit that referenced this pull request Aug 20, 2021
`stop()` is not designed to work properly if `start()` has not
previously succeeded (there's an XXX comment about this), and #5635 is
going to make early `stop()` calls into a hard error. This change
ensures that the signal handler won't cause that error to show up.

Also, serverless integrations don't use the same sort of process-based
shutdown as other environments (and you don't call `start` or `listen`
yourself), so registering these signal handlers isn't a great default.
(They start "listening" before the ApolloServer is started, so it would
be weird if after this change the signal handling depended on whether or
not a request had been processed or not.) Make stopOnTerminationSignals
default to false for serverless integrations.
glasser added a commit that referenced this pull request Aug 20, 2021
`stop()` is not designed to work properly if `start()` has not
previously succeeded (there's an XXX comment about this), and #5635 is
going to make early `stop()` calls into a hard error. This change
ensures that the signal handler won't cause that error to show up.

Also, serverless integrations don't use the same sort of process-based
shutdown as other environments (and you don't call `start` or `listen`
yourself), so registering these signal handlers isn't a great default.
(They start "listening" before the ApolloServer is started, so it would
be weird if after this change the signal handling depended on whether or
not a request had been processed or not.) Make stopOnTerminationSignals
default to false for serverless integrations.
glasser added a commit that referenced this pull request Aug 20, 2021
…ess (#5639)

`stop()` is not designed to work properly if `start()` has not
previously succeeded (there's an XXX comment about this), and #5635 is
going to make early `stop()` calls into a hard error. This change
ensures that the signal handler won't cause that error to show up.

Also, serverless integrations don't use the same sort of process-based
shutdown as other environments (and you don't call `start` or `listen`
yourself), so registering these signal handlers isn't a great default.
(They start "listening" before the ApolloServer is started, so it would
be weird if after this change the signal handling depended on whether or
not a request had been processed or not.) Make stopOnTerminationSignals
default to false for serverless integrations.
@glasser glasser force-pushed the glasser/drain-server branch 4 times, most recently from d47459b to 37e32c6 Compare August 23, 2021 22:18
@glasser glasser force-pushed the glasser/drain-server branch from 2c375f1 to 51a6b19 Compare August 23, 2021 23:34
@glasser glasser marked this pull request as ready for review August 23, 2021 23:35
Previously, the batteries-included `apollo-server` package had a special
override of `stop()` which drains the HTTP server before letting the
actual Apollo Server `stop()` machinery begin. This meant that
`apollo-server` followed this nice shutdown lifecycle:
- Stop listening for new connections
- Close all idle connections and start closing connections as they go
  idle
- Wait a grace period for all connections to close and force-close any
  remaining ones
- Transition ApolloServer to the stopping state, where no operations
  will run
- Run stop hooks (eg send final usage report)

This was great... but only `apollo-server` worked this way, because only
`apollo-server` has full knowledge and control over its HTTP server.

This PR adds a server draining step to the ApolloServer lifecycle and
plugin interface, and provides a built-in plugin which drains a Node
`http.Server` using the logic of the first three steps above.
`apollo-server`'s behavior is now just to automatically install the
plugin.

Specifically:
- Add a new 'phase' called `draining` that fits between `started` and
  `stopping`. Like `started`, operations can still execute during
  `draining`. Like `stopping`, any concurrent call to `stop()` will just
  block until the first `stop()` call finishes rather than starting a
  second shutdown process.
- Add a new `drainServer` plugin hook (on the object returned by
  `serverWillStart`). Invoke all `drainServer` hooks in parallel during
  the `draining` phase.
- Make calling `stop()` when `start()` has not yet completed
  successfully into an error. That behavior was previously undefined.
  Note that as of #5639, the automatic `stop()` call from signal
  handlers can't happen before `start()` succeeds.
- Add `ApolloServerPluginDrainHttpServer` to `apollo-server-core`.
  This plugin implements `drainServer` using the `Stopper` class
  that was previously in the `apollo-server` package. The default
  grace period is 10 seconds.
- Clean up integration tests to just use `stop()` with the plugin
  instead of separately stopping the HTTP server. Note that for Fastify
  specifically we also call `app.close` although there is some weirdness
  here around both `app.close` and our Stopper closing the same server.
  A comment describes the weirdness; perhaps Fastify experts can improve
  this later.
- The Hapi web framework has built in logic that is similar to our
  Stopper, so `apollo-server-hapi` exports
  `ApolloServerPluginStopHapiServer` which should be used instead of the
  other plugin with Hapi.
- Remove some examples from READMEs and point to examples in the docs
  instead. Keeping both up to date is extra work.
- Fix some test issues (eg, have FakeTimers only mock out Date.now
  instead of setImmediate, drop an erroneous `const` which made an `app`
  not get cleaned up, etc).

Fixes #5074.
@glasser glasser force-pushed the glasser/drain-server branch from 51a6b19 to a59574e Compare August 23, 2021 23:40
@glasser
Copy link
Member Author

glasser commented Aug 23, 2021

I'd like to move forward with releasing this, but it would be great to have a second round of post-merge eyes on this from @IvanGoncharov. And @StephenBarlow , you may have thoughts on the docs!

@glasser glasser enabled auto-merge (squash) August 23, 2021 23:43
@glasser glasser merged commit 69df6bf into main Aug 23, 2021
@glasser glasser deleted the glasser/drain-server branch August 23, 2021 23:43
@glasser glasser added 2021-08 size/medium Estimated to take LESS THAN A WEEK labels Aug 23, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
size/medium Estimated to take LESS THAN A WEEK
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No great pattern for doing a graceful shutdown with Apollo Server integration packages
2 participants