Ensure fast response times #4113

Closed
jjbayer opened this issue Oct 7, 2024 · 1 comment

@jjbayer
Member

jjbayer commented Oct 7, 2024

Background

From the beginning, Relay was designed to respond to incoming requests as fast as possible. For example, we optimistically respond with 200 even if we don't know whether a project is rate limited, and before we even deserialize the envelope.

The request handler contains one violation of this design choice: we await an async response from the ProjectCache to check whether rate limits should be propagated to the client. The project cache is a service that uses an unbounded message queue, so a backlogged queue could delay the HTTP response indefinitely.

We discussed some options to resolve this:

  1. Implement a circuit breaker that skips the awaiting if some conditions are met (e.g. when a number of CheckEnvelope calls time out).
  2. Separate project cache tasks into high-priority messages that require a response ("queries") and low-priority messages that are fire-and-forget.
  3. Split the project cache service into an "observable state" component and an Addr, similar to what we did for the envelope buffer. The observable part grants read access (not write access) via an internal read-write lock that encapsulates the project map.
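
For reference, the circuit breaker in option 1 could be sketched roughly as follows. This is a minimal illustration, not Relay code: `CircuitBreaker`, its methods, and the "consecutive timeouts" condition are all assumptions chosen for the example.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical breaker: opens after `threshold` consecutive timeouts.
pub struct CircuitBreaker {
    threshold: u32,
    consecutive_timeouts: AtomicU32,
}

impl CircuitBreaker {
    pub fn new(threshold: u32) -> Self {
        Self {
            threshold,
            consecutive_timeouts: AtomicU32::new(0),
        }
    }

    /// Returns true when the request handler should skip awaiting the check.
    pub fn is_open(&self) -> bool {
        self.consecutive_timeouts.load(Ordering::Relaxed) >= self.threshold
    }

    /// Call when a check did not answer in time.
    pub fn record_timeout(&self) {
        self.consecutive_timeouts.fetch_add(1, Ordering::Relaxed);
    }

    /// Call when a check answered in time; closes the breaker again.
    pub fn record_success(&self) {
        self.consecutive_timeouts.store(0, Ordering::Relaxed);
    }
}
```

A real breaker would also need a policy for probing the service again once it is open; the sketch leaves that out.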

We decided to implement option 3, because measurements showed that option 2 won't resolve the issue (the project cache spends most of its time handling CheckEnvelope) and option 1 would only be a stopgap that delays work on the long-term solution.
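
To illustrate the shape of option 3, here is a minimal sketch using only the standard library. All names (`ProjectState`, `ProjectCacheHandle`, `ProjectCacheService`, the string project key) are assumptions made for the example and do not reflect Relay's actual types; the point is only that request handlers read through a shared lock instead of waiting on the service's message queue.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical per-project state; real project state would hold config etc.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ProjectState {
    pub rate_limited: bool,
}

/// Shared project map, keyed by a hypothetical project key string.
type ProjectMap = Arc<RwLock<HashMap<String, ProjectState>>>;

/// "Observable state": a cloneable, read-only view for request handlers.
#[derive(Clone)]
pub struct ProjectCacheHandle {
    projects: ProjectMap,
}

impl ProjectCacheHandle {
    /// Returns a snapshot of the state without going through a message queue.
    pub fn get(&self, key: &str) -> Option<ProjectState> {
        let guard = self.projects.read().unwrap_or_else(|e| e.into_inner());
        guard.get(key).cloned()
    }
}

/// Write side, owned by the service; nothing else can mutate the map.
pub struct ProjectCacheService {
    projects: ProjectMap,
}

impl ProjectCacheService {
    /// Creates the service together with its read-only handle.
    pub fn new() -> (Self, ProjectCacheHandle) {
        let projects = ProjectMap::default();
        let handle = ProjectCacheHandle {
            projects: Arc::clone(&projects),
        };
        (Self { projects }, handle)
    }

    /// Stores or replaces the state of one project.
    pub fn update(&self, key: &str, state: ProjectState) {
        let mut guard = self.projects.write().unwrap_or_else(|e| e.into_inner());
        guard.insert(key.to_owned(), state);
    }
}
```

Because the handle exposes no write methods, a backlog in the service can make the data stale but cannot block a reader for longer than a single write to the map.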

Implementation notes

  • Change the internal map of ProjectCache into something that allows concurrent access, e.g. RwLock or DashMap. Possibly use two layers of locks: one for the index/map and one for the project itself, which is continuously updated (rate limits, config, etc.).
  • Do not call get_or_create_project on every HTTP request. Instead, send a separate Prefetch message to the project cache to make sure the project is updated eventually.
  • Make sure that all message handlers in the project cache use read-only access where possible, to reduce contention on the read-write lock.
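
The notes above could be combined roughly as in the following sketch. It is an assumption-laden illustration rather than the actual implementation: `Handle`, `Service`, `check`, `handle_pending`, and the `fetch` callback are hypothetical, and a standard-library channel stands in for the service's message queue. Only the name `Prefetch` comes from the notes.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, RwLock};

/// Hypothetical per-project state.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ProjectState {
    pub rate_limited: bool,
}

/// Fire-and-forget request to refresh a project (payload is an assumption).
#[derive(Debug, PartialEq)]
pub struct Prefetch(pub String);

/// Two layers of locks: the outer guards the index, the inner each project.
type Projects = RwLock<HashMap<String, Arc<RwLock<ProjectState>>>>;

#[derive(Clone)]
pub struct Handle {
    projects: Arc<Projects>,
    prefetch: Sender<Prefetch>,
}

impl Handle {
    /// Request path: a read-only lookup that never creates an entry.
    pub fn check(&self, key: &str) -> Option<ProjectState> {
        // Sending on an unbounded channel does not block the request.
        let _ = self.prefetch.send(Prefetch(key.to_owned()));
        let index = self.projects.read().unwrap_or_else(|e| e.into_inner());
        let project = Arc::clone(index.get(key)?);
        // Release the index lock before taking the per-project lock.
        drop(index);
        let state = project.read().unwrap_or_else(|e| e.into_inner());
        Some((*state).clone())
    }
}

pub struct Service {
    projects: Arc<Projects>,
    inbox: Receiver<Prefetch>,
}

impl Service {
    pub fn new() -> (Self, Handle) {
        let (tx, rx) = channel();
        let projects = Arc::new(Projects::default());
        let handle = Handle {
            projects: Arc::clone(&projects),
            prefetch: tx,
        };
        (Self { projects, inbox: rx }, handle)
    }

    fn lookup(&self, key: &str) -> Option<Arc<RwLock<ProjectState>>> {
        let index = self.projects.read().unwrap_or_else(|e| e.into_inner());
        index.get(key).cloned()
    }

    /// Drains queued prefetches and returns how many were handled.
    /// The index is write-locked only when a project is missing.
    pub fn handle_pending(&self, fetch: impl Fn(&str) -> ProjectState) -> usize {
        let mut handled = 0;
        while let Ok(Prefetch(key)) = self.inbox.try_recv() {
            let fresh = fetch(&key);
            match self.lookup(&key) {
                Some(project) => {
                    let mut state = project.write().unwrap_or_else(|e| e.into_inner());
                    *state = fresh;
                }
                None => {
                    let mut index = self.projects.write().unwrap_or_else(|e| e.into_inner());
                    index.insert(key, Arc::new(RwLock::new(fresh)));
                }
            }
            handled += 1;
        }
        handled
    }
}
```

With this layout, updating an existing project only takes a read lock on the index, so concurrent request handlers contend on the index lock only while a new project is being inserted.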
@Dav1dde Dav1dde self-assigned this Oct 14, 2024
@Dav1dde Dav1dde closed this as completed Nov 6, 2024
@Dav1dde
Member

Dav1dde commented Nov 6, 2024

Done with #4199 and some follow-ups.
