[client-v2] Inserting compressed data (using compressed input stream) #2010
Comments
Good day, @dizider! I think the option of sending the SQL as a query parameter is good. I will experiment with this. So, in case you are reading compressed data from an external source, it would be great to forward it to the server. What compression algorithm do you use? Thanks!
It was supported in the old JDBC extended API.
I have already modified the client, with a four-line change, to suit my needs (SQL in a query parameter). Everything works fine, but the usage is not convenient, as there are more configuration settings for compression, etc.
As the example shows, I read uncompressed data from a source, perform some transformations, and compress it with LZ4 (ClickHouseLZ4OutputStream). The batches are quite large (~1 GB of uncompressed data), so I would like to compress the data before sending. Streaming data without buffering does not work for me, because I need to retry in case of an error and the source does not provide any way to read the same data twice.
Good day, @dizider! In 0.7.2 we are introducing a writer API that should solve the problem. See clickhouse-java/client-v2/src/test/java/com/clickhouse/client/insert/InsertTests.java, line 553, at a599876.
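For readers without the repository at hand, here is a minimal sketch of the writer-callback idea: the client hands your code the raw request body stream, so you can write pre-formatted (and pre-compressed) bytes yourself. The `DataStreamWriter` interface and the commented `client.insert(...)` call are assumptions inferred from the linked test, not verified 0.7.2 API; the callback is exercised against an in-memory stream so the sketch stays self-contained:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class WriterSketch {
    // Stand-in for the client-v2 writer callback: receives the request body
    // stream and writes already-formatted (and optionally compressed) data.
    interface DataStreamWriter {
        void onOutput(OutputStream out) throws IOException;
    }

    public static void main(String[] args) throws IOException {
        DataStreamWriter writer = out -> {
            // In the real client this would write e.g. RowBinary rows;
            // CSV is used here just to keep the sketch readable.
            out.write("1,alice\n2,bob\n".getBytes(StandardCharsets.UTF_8));
        };

        // With the real API the callback would be handed to something like:
        //   client.insert("my_table", writer, ClickHouseFormat.CSV, settings);
        // (hypothetical call; see the InsertTests.java example for the exact shape)
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        writer.onOutput(body);
        System.out.println(body.size()); // bytes produced by the callback
    }
}
```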
@chernser What if I need to define an insert query? Say I have a table with 50 columns and my Avro file (compressed stream) contains only 10 columns in a very specific order, and the insert should be
Thank you for the quick reaction and implementation! Works well.
Good day, @den-crane! |
Describe your feedback
The version 2 client allows inserting data from an input byte stream. It also supports inserting data in compressed format (the decompress option). I would like to be able to insert data from an already compressed data stream. This change would allow creating compressed in-memory batches and then sending them directly. I have already looked at the code and found that there is one major incompatibility with the current API: the insert statement is part of the request body and is prepended to each request. This is a problem because the whole body has to be either compressed or not. So I think the options are to send the statement as an HTTP query parameter or to allow user-defined insert statements.
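For context, the ClickHouse HTTP interface already supports this split: the statement can travel in the `query` URL parameter while the body carries only data, optionally compressed with a matching `Content-Encoding` header. A rough sketch using only the JDK (the host, table, and rows are made up, and the actual send is left as a comment since it needs a running server):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class QueryParamInsertSketch {
    public static void main(String[] args) throws IOException {
        // The INSERT goes into the URL, not the body.
        String sql = "INSERT INTO my_table FORMAT CSV"; // hypothetical table
        String encoded = URLEncoder.encode(sql, StandardCharsets.UTF_8);
        URI uri = URI.create("http://localhost:8123/?query=" + encoded);

        // Compress the payload once, up front; the body contains data only.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write("1,alice\n2,bob\n".getBytes(StandardCharsets.UTF_8));
        }

        // With a running server this could be sent via java.net.http.HttpClient:
        //   HttpRequest.newBuilder(uri)
        //       .header("Content-Encoding", "gzip")
        //       .POST(HttpRequest.BodyPublishers.ofByteArray(buf.toByteArray()))
        //       .build();
        System.out.println(uri.getRawQuery());
    }
}
```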
Code example
Here are two examples of how I would like to use the client. The second example is my actual use case: creating compressed batches in memory and then sending them. Creating these compressed batches should be more memory efficient, which helps in memory-heavy applications.
DISCLAIMER: The following examples do not work with the current implementation! They assume that the insert query will be sent as an HTTP query parameter.
Compressed stream
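The original snippet did not survive the page scrape; a hedged reconstruction of the intent follows. An already-compressed stream from an external source is forwarded to the client untouched; the `client.insert(...)` and settings calls are hypothetical (the issue's whole point is that this API does not exist yet), so they stay in comments, and JDK gzip stands in for the source to keep the sketch self-contained:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedStreamExample {
    public static void main(String[] args) throws IOException {
        // Simulate an external source that hands us already-compressed CSV.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write("1,alice\n2,bob\n".getBytes(StandardCharsets.UTF_8));
        }
        InputStream compressedSource = new ByteArrayInputStream(compressed.toByteArray());

        // Desired (hypothetical) usage: pass the compressed stream straight
        // through, telling the client not to re-compress, e.g.
        //   settings.compressClientRequest(false);
        //   client.insert("my_table", compressedSource, ClickHouseFormat.CSV, settings);
        // This only works if the INSERT is sent as a query parameter, so the
        // body contains nothing but the compressed data.

        // Sanity check: the stream decompresses back to the original rows.
        String roundTrip = new String(
                new GZIPInputStream(compressedSource).readAllBytes(),
                StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals("1,alice\n2,bob\n"));
    }
}
```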
Compressed batch
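This snippet was also lost in the scrape; the following hedged sketch shows the batch use case described above: compress rows into an in-memory buffer once, then send the resulting bytes, re-wrapping the same array on retry since the source cannot be read twice. The issue uses LZ4 via ClickHouseLZ4OutputStream; JDK gzip stands in here to avoid extra dependencies, and the client call is a hypothetical comment:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class CompressedBatchExample {
    public static void main(String[] args) throws IOException {
        // Build one compressed batch in memory while reading the source once.
        ByteArrayOutputStream batch = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(batch)) {
            for (int i = 1; i <= 3; i++) {
                gz.write((i + ",row-" + i + "\n").getBytes(StandardCharsets.UTF_8));
            }
        }
        byte[] payload = batch.toByteArray();

        // Because the batch is now plain bytes, retries are cheap: each attempt
        // just re-wraps the same array instead of re-reading the source.
        for (int attempt = 1; attempt <= 2; attempt++) {
            ByteArrayInputStream body = new ByteArrayInputStream(payload);
            // Hypothetical call (assumes the SQL travels as a query parameter):
            //   client.insert("my_table", body, ClickHouseFormat.CSV, settings);
            body.close();
        }
        System.out.println(payload.length > 0);
    }
}
```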