Scaling search service (bleve) #11008

Open
jvillafanez opened this issue Feb 13, 2025 · 0 comments


Is your feature request related to a problem? Please describe.

The search service opens a connection to the bleve index as soon as it starts. It's a read-write connection that locks out other processes (including the bleve command line) from accessing the index.
This is a problem because the service can't be scaled out: any additional replica won't be able to open the index.

Describe the solution you'd like

The search service should allow some degree of scaling.

Describe alternatives you've considered

Additional context

I've slightly modified the code to open the index in read-only mode, and wrote a small script that lists the fields in the index, also in read-only mode. The script can run and return results while the search service is running.
This leads to the following combinations:

  • write service + write script -> write script is locked
  • write service + read script -> read script is locked
  • read service + write script -> write script is locked
  • read service + read script -> read script can access
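The combinations above behave like a shared/exclusive lock: any number of read-only openers can coexist, while a read-write opener needs the index to itself. A minimal sketch of that rule follows. `Mode`, `ReadOnly`, `ReadWrite` and `CanOpen` are hypothetical names made up for illustration; they are not bleve identifiers, and the code only models the observed behaviour rather than how bleve implements its locking.

```go
package main

// Mode is how a process opens the index.
// Hypothetical type for illustration; not part of bleve.
type Mode int

const (
	ReadOnly Mode = iota
	ReadWrite
)

// CanOpen reports whether a new opener in mode `requested` would get access
// while `holders` already have the index open. It models a shared/exclusive
// lock: many readers may coexist, but a writer excludes everyone else.
func CanOpen(holders []Mode, requested Mode) bool {
	for _, h := range holders {
		if h == ReadWrite || requested == ReadWrite {
			return false
		}
	}
	return true
}
```

With this model, only the last combination in the list (read service + read script) returns true.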

Proposal

The search service won't open a connection when it starts. Instead, it will open a connection every time it needs to access the index. This means that any operation will open a new connection, do whatever it needs to do, and close the connection.
For read-only operations ("search" and "doc count"), the connection must be opened in read-only mode. As long as there are only read-only connections, concurrent access should be allowed, and it should be possible to access the index from multiple sources (such as other replicas).
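The open-per-operation pattern could be sketched roughly as below. This is not the search service's actual code: `Index`, `Opener`, `withIndex`, `DocCount` and `fakeIndex` are hypothetical names, and the in-memory `fakeIndex` only exists so the sketch runs without an index on disk. In a real service the opener would presumably wrap something like `bleve.OpenUsing` with a read-only runtime option, but that wiring is an assumption and is not shown here.

```go
package main

// Index is a hypothetical subset of the index operations the service needs.
type Index interface {
	DocCount() (uint64, error)
	Close() error
}

// Opener opens the index; readOnly selects the connection mode.
type Opener func(readOnly bool) (Index, error)

// withIndex opens a connection, runs fn, and always closes the connection,
// so the index is only held for the duration of one operation.
func withIndex(open Opener, readOnly bool, fn func(Index) error) (err error) {
	idx, err := open(readOnly)
	if err != nil {
		return err
	}
	defer func() {
		// Report the close error only if the operation itself succeeded.
		if cerr := idx.Close(); err == nil {
			err = cerr
		}
	}()
	return fn(idx)
}

// DocCount is a read-only operation, so it asks for a read-only connection.
func DocCount(open Opener) (uint64, error) {
	var n uint64
	err := withIndex(open, true, func(idx Index) error {
		var e error
		n, e = idx.DocCount()
		return e
	})
	return n, err
}

// fakeIndex is an in-memory stand-in used only to exercise the sketch.
type fakeIndex struct {
	docs   uint64
	closed *int // incremented on every Close call
}

func (f fakeIndex) DocCount() (uint64, error) { return f.docs, nil }
func (f fakeIndex) Close() error              { *f.closed++; return nil }
```

Write operations would call `withIndex` with `readOnly` set to false, which is where the serialization discussed in this issue would show up.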

Assuming all operations take little time, the proposal should provide some degree of concurrency because the connections will be short-lived.
The problem is that this only works well for systems with far more reads than writes, and that's something we can't guarantee. For a system with a lot of writes, the operations are expected to be serialized, with no concurrency at all.
In addition, it's unclear how the locks are handled. If lock acquisition isn't fair, it could lead to starvation: a write might never happen if reads keep coming.
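To make the starvation concern concrete, here is a small tick-based simulation of a shared/exclusive lock. It is a toy model under stated assumptions and says nothing about how the index lock actually behaves (that part is still unclear): one reader already holds the lock when the writer arrives at tick 0, further readers arrive at a fixed interval, and `WriterWait` and its parameters are hypothetical names.

```go
package main

// WriterWait returns the tick at which a writer that arrives at tick 0
// acquires the lock, or -1 if it is still waiting after `horizon` ticks.
//
// One reader already holds the lock at tick 0. Another reader arrives every
// `every` ticks, and each reader holds the lock for `hold` ticks.
// If fair is true, readers arriving after the writer queue behind it.
// If fair is false, they are admitted while the lock is in shared mode.
func WriterWait(every, hold, horizon int, fair bool) int {
	releases := []int{hold} // release tick of each active reader
	for t := 0; t < horizon; t++ {
		// Drop readers whose hold time has ended.
		active := releases[:0]
		for _, r := range releases {
			if r > t {
				active = append(active, r)
			}
		}
		releases = active

		// The writer gets in as soon as no reader holds the lock.
		if len(releases) == 0 {
			return t
		}

		// With an unfair lock, a new reader joins the current holders.
		if !fair && t > 0 && t%every == 0 {
			releases = append(releases, t+hold)
		}
	}
	return -1
}
```

In this model, when reads overlap (`hold` greater than `every`) the unfair variant never lets the writer in, while the fair variant lets it in as soon as the initial reader finishes.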

Since every request will (potentially) open a new connection under this proposal, the problems mentioned above should become visible on a system under heavy load.
