Sink Writer to write in batches #42

Closed
talsheldon opened this issue Dec 18, 2022 · 11 comments · Fixed by #58

@talsheldon

My understanding is that for every single element (from upstream) there will be an HTTP request for it.

 HttpSink.builder[..]
   .setElementConverter(
     // s is the upstream element; it is serialized to JSON and becomes one POST body
     (s, _context) => new HttpSinkRequestEntry(
       "POST",
       new Gson().toJson(s).getBytes(StandardCharsets.UTF_8)))

So basically every value from upstream becomes one HttpSinkRequestEntry, which later translates into one HTTP request when submitRequestEntries finally executes.
I'd like to send fewer HTTP requests by batching them into a single HTTP request whose body is a list of the elements, e.g. one HTTP request per batch of maxBatchSize elements.

How could I achieve that?

@kristoffSC
Collaborator

Hi @talsheldon
Thanks for the question.

I will reply after Dec 20th, when I am back from vacation.

@kristoffSC kristoffSC self-assigned this Dec 18, 2022
@kristoffSC
Collaborator

kristoffSC commented Dec 21, 2022

Hi @talsheldon

> My understanding is that for every single element (from upstream) there will be an HTTP request for it.

Yes, this is correct. Currently, the HTTP Sink splits the collection of requests passed from Flink via AsyncSink/AsyncSinkWriter into individual requests.

The reason is that, at the time, this was what we needed in our project. The web service that we were supposed to send requests to was unable to handle "batch" requests. It simply does not understand a REST body containing an array of individual requests.
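
To make the behaviour described above concrete, here is a minimal Java sketch of the "one HTTP request per entry" pattern. It is an illustration under stated assumptions, not the connector's actual code: the class name `PerEntryRequests`, the method `toRequests`, and the endpoint URL are all hypothetical, and only the standard `java.net.http` API is used.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the connector's implementation.
class PerEntryRequests {

    // Builds one POST request per serialized element, so N entries yield N requests.
    static List<HttpRequest> toRequests(URI endpoint, List<byte[]> entries) {
        List<HttpRequest> requests = new ArrayList<>();
        for (byte[] body : entries) {
            requests.add(HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build());
        }
        return requests;
    }

    public static void main(String[] args) {
        List<byte[]> entries = List.of(
            "{\"id\":1}".getBytes(StandardCharsets.UTF_8),
            "{\"id\":2}".getBytes(StandardCharsets.UTF_8),
            "{\"id\":3}".getBytes(StandardCharsets.UTF_8));
        // The URL is a placeholder; the requests are only built, never sent.
        List<HttpRequest> requests =
            toRequests(URI.create("http://localhost:8080/ingest"), entries);
        System.out.println(entries.size() + " entries -> " + requests.size() + " HTTP requests");
    }
}
```

With three entries this builds three separate requests, which is the overhead the issue asks to avoid.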

> I'd like to have fewer HTTP requests, and batch these requests to a single HTTP request

Yes, this is something that would be a good enhancement for the connector, totally agree.

At first glance, the change we have to implement is twofold. We would have to refactor the JavaNetSinkHttpClient::submitRequests method, which currently splits the collection of elements into individual HTTP requests, and also think a little bit about setElementConverter.
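
The batching idea can be sketched roughly as follows. This is only an assumption about one possible shape of the change, not the code that was eventually merged: the class `BatchBodies` and the method `toBatchedBodies` are hypothetical names, and the sketch assumes each entry is already a serialized JSON value that the target service accepts inside a JSON array.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the connector's implementation.
class BatchBodies {

    // Groups already-serialized JSON elements into chunks of at most maxBatchSize
    // and wraps each chunk in a JSON array, yielding one request body per chunk.
    static List<byte[]> toBatchedBodies(List<byte[]> entries, int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be at least 1");
        }
        List<byte[]> bodies = new ArrayList<>();
        for (int start = 0; start < entries.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, entries.size());
            StringBuilder body = new StringBuilder("[");
            for (int i = start; i < end; i++) {
                if (i > start) {
                    body.append(',');
                }
                body.append(new String(entries.get(i), StandardCharsets.UTF_8));
            }
            bodies.add(body.append(']').toString().getBytes(StandardCharsets.UTF_8));
        }
        return bodies;
    }

    public static void main(String[] args) {
        List<byte[]> entries = List.of(
            "{\"id\":1}".getBytes(StandardCharsets.UTF_8),
            "{\"id\":2}".getBytes(StandardCharsets.UTF_8),
            "{\"id\":3}".getBytes(StandardCharsets.UTF_8));
        // With maxBatchSize = 2, three entries produce two request bodies.
        for (byte[] body : toBatchedBodies(entries, 2)) {
            System.out.println(new String(body, StandardCharsets.UTF_8));
        }
    }
}
```

Each returned body would then be sent as a single HTTP request. Whether the receiving service understands an array body is exactly the compatibility concern raised earlier in this comment, so a real implementation would likely need to keep the per-entry mode available.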

If you would be interested in contributing, let me know.

@kristoffSC kristoffSC added the enhancement New feature or request label Dec 21, 2022
@shmilygkd

Does the current version support this feature?

@kristoffSC
Collaborator

Hi @shmilygkd, unfortunately this feature is still not supported.

Maybe I will be able to find some time to work on it in the near future, or maybe you would be interested in contributing?

At first glance, the change we have to implement is twofold. We would have to refactor the JavaNetSinkHttpClient::submitRequests method, which currently splits the collection of elements into individual HTTP requests, and also think a little bit about setElementConverter.

If you would be interested in contributing, let me know.

@shmilygkd

Well, let's sync up once there is progress.

@kristoffSC
Collaborator

@shmilygkd actually, I am working on this as we speak :)

@shmilygkd

Looking forward to it!

@kristoffSC
Collaborator

It is expected to be ready sometime next week.

@kristoffSC
Collaborator

This will be released in 0.10.0. It is currently available in 0.10.0-snapshot.

@kristoffSC
Collaborator

@talsheldon @shmilygkd
I've released version 0.10.0 that contains this feature.
Feel free to try it :)

@shmilygkd

@kristoffSC good job~
