Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not block graphql_transport_ws operations while creating context or validating a single request operation #2829

Conversation

kristjanvalur
Copy link
Contributor

@kristjanvalur kristjanvalur commented Jun 8, 2023

Now both Context creation and validation of an operation (subscripton or single-result operation) are performed
on the worker task, freeing up the Websocket for other requests.

Description

Previously both creating the context, and validating the request, were done on the same task which handles
incoming messages for the Websocket. Both of these operations are async and can potentially take a long time.
Now they are performed on the worker background Task which is created to handle the operation.

We add tests to make sure that long operations in context and validation don't block the websocket connection.

Types of Changes

  • Core
  • Bugfix
  • New feature
  • Enhancement/optimization
  • Documentation

Issues Fixed or Closed by This PR

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • I have tested the changes and verified that they work and don't break anything (as well as I can manage).

@codecov
Copy link

codecov bot commented Jun 8, 2023

Codecov Report

Attention: Patch coverage is 99.33333% with 1 line in your changes missing coverage. Please review.

Project coverage is 96.53%. Comparing base (fc25e04) to head (e43aca8).
Report is 53 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2829      +/-   ##
==========================================
- Coverage   96.79%   96.53%   -0.27%     
==========================================
  Files         517      517              
  Lines       33517    33598      +81     
  Branches     5570     5573       +3     
==========================================
- Hits        32444    32433      -11     
- Misses        845      926      +81     
- Partials      228      239      +11     

@kristjanvalur kristjanvalur force-pushed the kristjan/validate-in-task branch 3 times, most recently from 33caef4 to 2008261 Compare June 8, 2023 15:34
@kristjanvalur kristjanvalur marked this pull request as ready for review June 8, 2023 15:49
@botberry
Copy link
Member

botberry commented Jun 8, 2023

Thanks for adding the RELEASE.md file!

Here's a preview of the changelog:


Operations over graphql-transport-ws now create the Context and perform validation on
the worker Task, thus not blocking the websocket from accepting messages.

Here's the tweet text:

🆕 Release (next) is out! Thanks to @kristjanvalur for the PR 👏

Get it here 👉 https://strawberry.rocks/release/(next)

@DoctorJohn
Copy link
Member

Thanks for updating @kristjanvalur, still validating the changes 👍

@kristjanvalur
Copy link
Contributor Author

PR is updated but I cannot see from the codecov that there is any missing coverage for my changes.

@kristjanvalur
Copy link
Contributor Author

This PR needs to be completely refactored because of the upstream changes.
The "context" and "root value" are now created in a different place, long before the task is created. These really need to be evaluated in the context of the Task to not block the socket. I will attempt to re-factor this again.

@DoctorJohn
Copy link
Member

This PR needs to be completely refactored because of the upstream changes. The "context" and "root value" are now created in a different place, long before the task is created. These really need to be evaluated in the context of the Task to not block the socket. I will attempt to re-factor this again.

Hi Kristján, thanks for bringing this up again!

During a brief discussion of this PR at PyconIT earlier this year, I realized that the strawberry core devs I talked to generally perceive get_context and get_root_value as lightweight methods. After recent refactors, both methods are only being called once per WebSocket connection.

Based on the assumption that these operations are lightweight, we were less concerned about get_context/get_root_value than the request validation mentioned in this PR's description. Feel free to split the request validation optimization in a separate PR so that we do not delay the merging of either part of this PR by pondering the other.

@kristjanvalur
Copy link
Contributor Author

Sounds good. I'll leave the context/root as they are and rework this to just do the validation.

@kristjanvalur
Copy link
Contributor Author

kristjanvalur commented Oct 13, 2024

It appears that validation now happens in the task ,because the result_source is awaited inside the Task now. So presumably, validation now happens as part of that, in a separate Task. Since context is now created once for each connection, this leaves nothing left for this PR to do. All it could contribute is perhaps tests that validation does not block other messages for the connection. I'm inclined to simply close this.

@kristjanvalur
Copy link
Contributor Author

kristjanvalur commented Oct 13, 2024

Actually, my tests as written for the old PR do indicate a discrepancy in validation handling for queries vs subscriptions.
I'll create a new PR with just the tests, showing the problem, without proposing a solution.
see #3671. Please have a look at this and give me your opinion.

@kristjanvalur
Copy link
Contributor Author

By the way, speaking of "context" being created once for each WS connection, I want to draw your attention to this bug: #1754

Copy link
Member

@bellini666 bellini666 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My $.02 cents in here, leaving for @DoctorJohn to actually approve this once it is ready

@@ -107,7 +108,7 @@ def on_request_accepted(self) -> None:

async def handle_connection_init_timeout(self) -> None:
task = asyncio.current_task()
assert task
assert task is not None # for typecheckers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since we enforce typing a lot here, assert <something> is a well known practice for type checkers, so I don't think the comment is required

Suggested change
assert task is not None # for typecheckers
assert task is not None

await operation.send_message(CompleteMessage(id=operation.id))
finally:
# Close the AsyncGenerator in case of errors or cancellation
await result_source.aclose()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be wrapped in a with suppress(asyncio.CancelledError)?

@DoctorJohn
Copy link
Member

It appears that validation now happens in the task ,because the result_source is awaited inside the Task now. So presumably, validation now happens as part of that, in a separate Task. Since context is now created once for each connection, this leaves nothing left for this PR to do. All it could contribute is perhaps tests that validation does not block other messages for the connection. I'm inclined to simply close this.

Thanks for verifying! It looks like this was changed when extension support for subscriptions was added. Let's close this PR then.

Actually, my tests as written for the old PR do indicate a discrepancy in validation handling for queries vs subscriptions.
I'll create a new PR with just the tests, showing the problem, without proposing a solution.
see #3671. Please have a look at this and give me your opinion.

Thank you so much for highlighting that. I'll check the specs and the PR introducing this discrepancy again. It looks like something slipped through the review and had no proper tests.

By the way, speaking of "context" being created once for each WS connection, I want to draw your attention to this bug: #1754

👍 Currently reading through it.

@DoctorJohn DoctorJohn closed this Oct 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants