-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Streaming query does not work in release mode. #6
Comments
I've had a look at LZ4 in ClickHouse and found an extensive blogpost about compression. https://clickhouse.com/blog/lz4-compression-in-clickhouse From the post: "Our conclusion is thus that it's always preferable to store data in a compressed format. Always write data to disk in compressed format. Transmit data over the network with compression, as well. " Apparently, they only added the no compression option for a very specific corner case. The bigger question is: Why does streaming query on Port 3128 does not work with compression? Doesn't Proton compresses outgoing streaming data by default or is it related to how the client decompresses the incoming data? In the first case, something need to be worked out in Proton whereas in the second case I have to talk to the folks maintaining the underlying ClickHouse client. For the time being, I simply patch in a new constructor for streaming that sets the port correctly and disables compression. This is a workaround until the root cause of the issue has been identified and hopefully resolved. |
The key difference here is, for streaming query, the returned result is a unbounded stream, and this is totally a new thing introduced by Proton, it does not exists in ClickHouse. Compressing / de-compressing a stream is quite different from compressing / de-compressing the whole response ( for non-streaming queries ). There are quite some details need to be designed on how we want the compression should be implemented. Let's put the complexity aside for now, once implement the compression on streaming, we need to implement the de-compression logic on the client side as well. And this unlikely will be part of the But yes, for now, we need to work around the issue first. I think a new constructor probably is not the best solution, because it could be troublesome to construct different clients for different use cases. IMO, maybe keeping two pub fn new(url: &str) -> Self {
let client = Client::default().with_url(url.to_string());
let client_without_compression = client.clone().with_compression(clickhouse::Compression::None);
let url = url.to_string();
Self { client, client_without_compression, url }
} Then we introduce a function pub async fn fetch_streaming<T: Row>(&self, query: &str) -> Result<RowCursor<T>> {
match self.client_without_compression.query(query).fetch::<T>() {
Ok(cursor) => Ok(cursor),
Err(e) => Err(ProtonClientError::FetchFailed(e.to_string())),
}
} |
That's cool.
Thank you for explaining the underlying rationale.
I'm going to update the normal constructor as you suggested, add a fetch
stream against the uncompressed client, and submit a PR a bit later today
to get this out of the way.
Once the release bot has issued a new version, it would be be great if you
could do some testing in release mode to double check if the issue is gone
with this workaround.
I think in the long run I have to come up with something else but let's
figure that out later.
…On Wed, Feb 14, 2024 at 1:29 PM Gimi Liang ***@***.***> wrote:
Why does streaming query on Port 3128 does not work with compression?
The key difference here is, for streaming query, the returned result is a
unbounded stream, and this is totally a new thing introduced by Proton, it
does not exists in ClickHouse. Compressing / de-compressing a stream is
quite different from compressing / de-compressing the whole response ( for
non-streaming queries ). There are quite some details need to be designed
on how we want the compression should be implemented.
Let's put the complexity aside for now, once implement the compression on
streaming, we need to implement the de-compression logic on the client side
as well. And this unlikely will be part of the clickhouse crate, because,
again, this is not a clickhouse feature, but a Proton one. So this will be
implemented in this crate I believe.
But yes, for now, we need to work around the issue first. I think a new
constructor probably is not the best solution, because it could be
troublesome to construct different clients for different use cases. IMO,
maybe keeping two clickhouse::Client inside ProtonClient is a better way.
Something like this:
pub fn new(url: &str) -> Self {
let client = Client::default().with_url(url.to_string());
let client_without_compression = client.clone().with_compression(clickhouse::Compression::None);
let url = url.to_string();
Self { client, client_without_compression, url }
}
Then we introduce a function fetch_streaming for streaming queries:
pub async fn fetch_streaming<T: Row>(&self, query: &str) -> Result<RowCursor<T>> {
match self.client_without_compression.query(query).fetch::<T>() {
Ok(cursor) => Ok(cursor),
Err(e) => Err(ProtonClientError::FetchFailed(e.to_string())),
}
}
—
Reply to this email directly, view it on GitHub
<#6 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AFYR7XFTS7MLONZK5GANXL3YTRDS7AVCNFSM6AAAAABDHQNP72VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNBTGEYDOOBYGY>
.
You are receiving this because you were assigned.Message ID:
***@***.***>
|
Thank you both for looking into the details and refine the proton rust client |
PR #7 resolves this issue. When @jovezhong has merged both, PR #7 and PR #5 , a new version 0.1.1. should be available on crates.io within a few minutes after the release bot completed. Please test this patched version to verify if the issue has been resolved. If you encounter any problem, please report back here. Otherwise, feel free to close the issue. @jovezhong I think its best to push back the blog post to next week so that I have a bit more time to write some meaningful code examples and update the post accordingly. While the current workaround might be imperfect, at least it should resolve this issue for the time being. Details for better and more future proof solution can be discussed in Slack. I will take another look once I am back in Office on Monday, |
* Updated Readme. Signed-off-by: Marvin Hansen <[email protected]> * Enabled relase-plz Signed-off-by: Marvin Hansen <[email protected]> * Added client without compression to resolve issue with handling streaming data in release mode. Added new method fetch_stream to ,well, fetch data from a continuous query of a stream. Signed-off-by: Marvin Hansen <[email protected]> * Fixed Docstring test in fetch_stream. Signed-off-by: Marvin Hansen <[email protected]> * Update src/lib/stream.rs Co-authored-by: Gimi Liang <[email protected]> * Update src/lib/mod.rs Co-authored-by: Gimi Liang <[email protected]> --------- Signed-off-by: Marvin Hansen <[email protected]> Co-authored-by: Jove Zhong <[email protected]> Co-authored-by: Gimi Liang <[email protected]>
Describe what's wrong
Proton has two sets of HTTP APIs, one is running on port
8123
, this is the same as clickhouse. And the other is running on port3218
which supports streaming queries. Simply put, the same querySELECT * FROM some_stream
behaves differently on these two APIs. On port8123
it returns immediately, but on port3128
, it will keep running and returnning fresh data.This isssue is about running streaming query on port
3128
with this library. It won't work in release mode (e.g. usingcargo run --release
). The reason why it worked for non-release mode is because of this:with the
test-util
, theclickhouse
crate does not use any compression by default (otherwise, it useslz4
), this is the tricky part. And withlz4
, streaming query does not work. So, in order to make streaming query work, we need to disable compression when do streaming query:How to reproduce
Error message and/or stacktrace
Additional context
The text was updated successfully, but these errors were encountered: