
Consider Adding Cache Layer to Reduce Requests to Trino and Improve Response Time #606

Open
digarcia78 opened this issue Jan 31, 2025 · 1 comment

@digarcia78

We’ve noticed that multiple identical requests are being sent to Trino, which can result in slow response times because the same results are recalculated repeatedly. To improve performance, we would like to suggest implementing a cache layer that stores intermediate results and avoids sending requests to Trino for data it has already computed.

Would love to hear if this is something that could be implemented or if there are any existing solutions in place for this. Do you have any recommendations or suggestions on how we might approach this?

@xkrogen
Member

xkrogen commented Jan 31, 2025

Actually just two weeks ago I built something like this into the Gateway as a hackday project :) I think it's a great idea -- at LinkedIn we see the same situation as what you described, where there are many instances of duplicate queries doing redundant computation. Dashboarding solutions hooked up to Trino definitely cause this -- maybe a bunch of different people are loading the same dashboard, or you've edited one chart on a dashboard and when it refreshes it reloads all of the charts even though most of them haven't changed.

Given that Trino queries typically have very small output result sets, the cache itself can stay fairly small. In my POC, I stored the results in the same DB that the GW uses for all of its other operational data. We could consider doing this only for queries with very small outputs, and use a different, more scalable storage for queries with larger result sets. One area I'm also super interested in is using the new spooling client protocol to implement this cache in a zero-copy manner, at least for clients which leverage the spooled protocol.
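To make the tiering idea above concrete, here's a minimal sketch of a store that routes result sets by serialized size: small results land in the DB-backed store and large ones in a separate bulk store. The class name, threshold, and map-backed stores are all illustrative assumptions, not anything from the Gateway codebase.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: route cached result sets by serialized size.
// Small results go to the Gateway's operational DB (modeled here as a map);
// large ones go to a more scalable bulk store (e.g. object storage).
public class TieredResultStore {
    // Threshold is an assumption for illustration, not a Gateway setting.
    static final int SMALL_RESULT_BYTES = 64 * 1024;

    private final Map<String, byte[]> dbBackedStore = new ConcurrentHashMap<>();
    private final Map<String, byte[]> bulkStore = new ConcurrentHashMap<>();

    public void put(String queryKey, byte[] serializedResults) {
        if (serializedResults.length <= SMALL_RESULT_BYTES) {
            dbBackedStore.put(queryKey, serializedResults);
        } else {
            bulkStore.put(queryKey, serializedResults);
        }
    }

    public byte[] get(String queryKey) {
        byte[] result = dbBackedStore.get(queryKey);
        return result != null ? result : bulkStore.get(queryKey);
    }

    public boolean isInDb(String queryKey) {
        return dbBackedStore.containsKey(queryKey);
    }
}
```

In a real implementation the two maps would be replaced by the Gateway's DB and an external store, and eviction would be handled by the invalidation logic discussed below.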

I think the hardest part of this is probably determining how long to cache the results for and when to invalidate the cache. In my POC I just had a fixed TTL and used the exact query text as the cache key. But I think to make this work well, we'd have to add much smarter invalidation logic. We could look for nondeterministic functions (e.g. random) or time-based functions (e.g. current_time or current_date) and treat those differently. For data sources which support it, such as Iceberg, we could use a mechanism like snapshot IDs to determine if the input data sources have changed and then invalidate the cache. Lots of interesting stuff that can be explored here!
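The POC approach plus the simplest form of the invalidation heuristic could look something like this: exact query text as the cache key, a fixed TTL, and a regex check that refuses to cache queries calling nondeterministic or time-based functions. All names are illustrative assumptions; a production version would parse the query rather than pattern-match its text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: TTL-based result cache keyed by exact query text,
// skipping queries that call nondeterministic or time-based functions.
public class QueryResultCache {
    // Heuristic only: a real implementation would inspect the parsed plan.
    private static final Pattern UNCACHEABLE = Pattern.compile(
            "\\b(random|rand|uuid|current_timestamp|current_time|current_date|now|localtime)\\b",
            Pattern.CASE_INSENSITIVE);

    record Entry(String results, long storedAtMillis) {}

    private final long ttlMillis;
    private final Map<String, Entry> entries = new HashMap<>();

    public QueryResultCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public static boolean isCacheable(String queryText) {
        return !UNCACHEABLE.matcher(queryText).find();
    }

    public void put(String queryText, String results, long nowMillis) {
        if (isCacheable(queryText)) {
            entries.put(queryText, new Entry(results, nowMillis));
        }
    }

    // Returns null on a miss or when the entry has exceeded its TTL.
    public String get(String queryText, long nowMillis) {
        Entry e = entries.get(queryText);
        if (e == null || nowMillis - e.storedAtMillis() > ttlMillis) {
            return null;
        }
        return e.results();
    }
}
```

Snapshot-based invalidation for Iceberg sources would slot in naturally here: store the snapshot IDs of the inputs alongside each entry, and treat a mismatch on lookup the same as an expired TTL.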

As an even further extension, what really gets me excited is that once we're persisting the query results, we could use that to implement query-level retries in the Gateway, pulling the same functionality as Trino FTE's QUERY retry policy up to the Gateway layer to allow it to recover even from coordinator/cluster failures and do retries across clusters (including potentially retrying a query on a "larger" or differently tuned cluster).
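The cross-cluster retry extension can be sketched as a loop that replays the same query text on the next candidate cluster when one fails, analogous to lifting FTE's QUERY retry policy into the Gateway. The runner interface and cluster names are assumptions for illustration only.

```java
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical sketch: gateway-level QUERY retries across clusters.
// If a cluster (or its coordinator) fails, replay the query on the next
// cluster in the list, which may be a larger or differently tuned one.
public class CrossClusterRetry {
    public static String runWithRetries(
            String queryText,
            List<String> clusters,
            BiFunction<String, String, String> runOnCluster) {
        RuntimeException lastFailure = null;
        for (String cluster : clusters) {
            try {
                return runOnCluster.apply(cluster, queryText);
            } catch (RuntimeException e) {
                lastFailure = e; // cluster/coordinator failure: try the next one
            }
        }
        throw new IllegalStateException("all clusters failed", lastFailure);
    }
}
```

Persisted results make this safe to combine with the cache: a query that already completed on a failed cluster can be served from the cache instead of being re-run.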
