Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add method to asynchronously prepare CQL statements #1239

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

lukasz-antoniak
Copy link
Contributor

@lukasz-antoniak lukasz-antoniak commented Dec 2, 2024

Add an option to asynchronously prepare CQL statements.

I think that proposed implementation has one difference in behaviour that I was not able to solve - after calling prepare() CQL statement was prepared on all nodes when this option was selected. After the change, when calling prepare_async(), future completes only after triggering preparation on all other nodes (operation was not completed). Maybe we should keep implementation of both functions separated and preserve synchronous behaviour.

@lukasz-antoniak
Copy link
Contributor Author

I have implemented a small fix to preserve synchronous behaviour of prepare logic in the prepare() call.

@lukasz-antoniak lukasz-antoniak marked this pull request as ready for review December 6, 2024 09:47
Copy link
Collaborator

@absurdfarce absurdfarce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth our while trying to figure out a way to implement this using something like a Future.then() syntax to return a future which does the prepare and then calls prepare_on_all_hosts() when that's successful. This will take some more plumbing to support this use case but it's a much simpler implementation and such an impl maps naturally onto this use case.

def _create_response_future(self, query, parameters, trace, custom_payload,
timeout, execution_profile=EXEC_PROFILE_DEFAULT,
paging_state=None, host=None):
def prepare_async(self, query, custom_payload=None, keyspace=None, prepare_on_all_hosts=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should live near the impl for prepare(). So either it should be moved down below prepare() or we should bring prepare() up here.

@@ -5105,6 +5130,49 @@ def __str__(self):
__repr__ = __str__


class PrepareFuture(ResponseFuture):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm extremely skeptical of the idea of extending ResponseFuture to a prepare-specific future implementation. There's a lot of functionality in ResponseFuture and we'd have to make sure everything we need there was duplicated here... and it's easy to miss things. Specifically ResponseFuture already has a lot of logic for dealing with prepare statements + responses... I'd rather find a way to re-use that and handle the prepare-on-all-hosts ops via callbacks (or perhaps something better) rather than subclass the future impl.

except Exception:
log.exception("Error preparing query on all hosts:")
return response
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems exactly right: prepare() should be implemented as prepare_async().get() once we have a good working prepare_async(). But why do we need to do the prepare_on_all_nodes() operation here? We should've already done that when future.result() completed (since it's done in PrepareFuture._set_final_result()).

# we are on event loop thread, so do not execute those synchronously
session.submit(
session.prepare_on_all_nodes,
self.query_string, self._current_host, self._keyspace)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems strange to me to have this logic embedded within the future impl like this. Seems like this should be done when the future is created, something like:

# _create_prepare_response_future() in this impl returns a regular ResponseFuture
future = self._create_prepare_response_future(query, keyspace, custom_payload, prepare_on_all_hosts)
if prepare_on_all_hosts:
   # Hand waving about partial application here in order to pass in parameters; the point is we get to a future that
   # calls prepare_on_all_hosts() after we've received a response to our prepare here here
   future = future.then(prepare_on_all_hosts)
future._protocol_handler = self.client_protocol_handler

Problem of course is the ResponseFuture doesn't support this then() syntax. It does have native support for callbacks but that isn't the same; that's just a function that gets invoked when the operation completes. We don't return a new future that returns the result of the function defined in the then() call like you do in most future APIs.

I'm wondering if we can either (a) add something like that or (b) find a way to wrap this functionality in another future lib in order to simplify an impl like this.

@lukasz-antoniak
Copy link
Contributor Author

I totally agree with your comments. Previously I could not find another way to reuse ResponseFuture and return PreparedStatement object. Can you check current state of PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants