roachtest: cdc/kafka-auth failed #118525
The kafka log file contains a bunch of failure messages like below:
I was able to reproduce this on master pretty consistently. Likely due to #117544.
Removing the release blocker since this seems to be a test issue. It works with the cockroach binary but not in roachtests.
Likely the same issue as https://cockroachlabs.slack.com/archives/C065X5307U3/p1702915552046409, but it is only surfacing now, after the upgrade.
roachtest.cdc/kafka-auth failed with artifacts on master @ cc4fdffa8532d16544c48ef036689763f737dc6b
roachtest.cdc/kafka-auth failed with artifacts on master @ fce4d4723519bc4ca6e9ef5da0ae19960c84752c
roachtest.cdc/kafka-auth failed with artifacts on master @ 15961a19faca0e2b66df2d01a547549523ca70c7
roachtest.cdc/kafka-auth failed with artifacts on master @ 3c41c509a87cba7a1fd3f5cfdb0f6badb78e3704
roachtest.cdc/kafka-auth failed with artifacts on master @ d272e9ef5589deff570efc023db6c70edfde311c
roachtest.cdc/kafka-auth failed with artifacts on master @ 715628abd134abfd2c0d966f9b7220a6715cc299
roachtest.cdc/kafka-auth failed with artifacts on master @ d7d442e4a3c9dca7e01c4c6f4f00e2f28faa4374
roachtest.cdc/kafka-auth failed with artifacts on master @ 7042601857042a057b1d4676735576cfbd37f36a
From kafka 2.0 onwards, host name verification of servers is enabled by default. `ssl.endpoint.identification.algorithm` defaults to `https`, which validates that the server host name matches the host name in the certificate. This patch fixes the failure by prepending `https` to the sink connection URL.

Fixes: cockroachdb#118525
Release note: none
roachtest.cdc/kafka-auth failed with artifacts on master @ 353fded9fe270b3eee4c85480ac1b9ec819f23b0
roachtest.cdc/kafka-auth failed with artifacts on master @ b2e31876366324c2ebe5c2ad8bbd644997e90864
roachtest.cdc/kafka-auth failed with artifacts on master @ b2e31876366324c2ebe5c2ad8bbd644997e90864
roachtest.cdc/kafka-auth failed with artifacts on master @ b2e31876366324c2ebe5c2ad8bbd644997e90864
roachtest.cdc/kafka-auth failed with artifacts on master @ 814a375d4c0e79d875c42452725f05f6c27294e3
roachtest.cdc/kafka-auth failed with artifacts on master @ 254dbd247fb8ed352a11439063b29f23a0767f28
roachtest.cdc/kafka-auth failed with artifacts on master @ cc6ca026319024800395293b0fb18f05dd8eb50e
From [kafka 2.0](https://kafka.apache.org/20/documentation.html#security_confighostname) onwards, host name verification of servers is enabled by default. This means that the "fake" certificate we generate and use for kafka-auth is no longer valid because it is missing the `DNSNames` field. Verification has been failing ever since, but the error was never surfaced to us until the sarama upgrade. This patch fixes the failure by adding the missing fields to the certificate.

Test history

1. Kafka-auth was working as expected. In this test, we generate and pass "fake" certificates for inter-broker communication within the Kafka cluster.
2. Some changes were made in the Java environment or the Kafka cluster (https://kafka.apache.org/20/documentation.html#security_confighostname), resulting in hostname verification that wasn't previously enforced. This means that the "fake" certificate we generated before is no longer valid because it is missing the `DNSNames` field. Since then, we’ve always been getting an error message in our Kafka server logs. But this error was never surfaced in the sarama code during Dial(), and kafka-auth only checks that the CREATE statement succeeds, not that messages are emitted. So our test has always been passing.
3. The sarama upgrade changed how Dial() works; it now invokes some previously untouched kafka code and surfaces the error.

Overall, this issue pertains to test misconfiguration and is not directly user-facing. But the sarama upgrade may lead to similar issues for customers due to the wide range of possible kafka configurations. In this case, we don't think a release note is necessary because customers should already have encountered this error message: the issue has been around for a while and should surface as soon as the customer does anything beyond Dial(), that is, when they try to emit messages to the kafka sink.

Fixes: cockroachdb#118525
Release note: none
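To illustrate why a missing or incomplete `DNSNames` field trips hostname verification, here is a minimal Go sketch using only the standard library. It is not the roachtest code: the helper `selfSignedCert` and the host name `kafka-node.example.internal` are hypothetical, and Go's `VerifyHostname` stands in for the check that Kafka's Java client performs when `ssl.endpoint.identification.algorithm` is `https`.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSignedCert is a hypothetical helper (not taken from roachtest). It
// returns a parsed self-signed certificate whose DNS subject alternative
// names are exactly dnsNames.
func selfSignedCert(dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "kafka-test"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
		DNSNames:              dnsNames, // nil means no SAN entries at all
	}
	// Self-signed: the template is both the certificate and its issuer.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	// Assumed fully qualified name that a broker might use to reach a peer.
	const fqdn = "kafka-node.example.internal"
	for _, sans := range [][]string{nil, {"localhost"}, {"localhost", fqdn}} {
		cert, err := selfSignedCert(sans)
		if err != nil {
			panic(err)
		}
		// VerifyHostname returns nil only when fqdn matches a SAN entry.
		fmt.Printf("SANs=%v verify(%q): %v\n", sans, fqdn, cert.VerifyHostname(fqdn))
	}
}
```

Only the last certificate, which lists the fully qualified name among its SANs, should pass the check for that name; the first two should return a hostname error.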
roachtest.cdc/kafka-auth failed with artifacts on master @ 7d0697b632066ee78735fc57e8150222d5576d0d
roachtest.cdc/kafka-auth failed with artifacts on master @ 0b7ae19e2b94b851ed8812914f57032aab699811
roachtest.cdc/kafka-auth failed with artifacts on master @ e39dafe6d8c153301ff43ed2b3ed3e13af9ec72a
roachtest.cdc/kafka-auth failed with artifacts on master @ e39dafe6d8c153301ff43ed2b3ed3e13af9ec72a
119077: roachtest/cdc: fix cdc/kafka-auth r=stevendanna a=wenyihu6

From [kafka 2.0](https://kafka.apache.org/20/documentation.html#security_confighostname) onwards, host name verification of servers is enabled by default. Previously, the self-signed test certificate we generated for kafka-auth only included “localhost” in the list of subject alternative names. However, kafka appears to make internal connections using the fully qualified domain name. As a result, some inter-broker communication has been failing with a hostname verification error for some time, but the failure wasn’t raised to the user until the sarama upgrade. This patch fixes the failure by adding the proper hostname of the kafka node to the certificate.

We don’t believe this represents a meaningful customer-facing issue. The misconfiguration of the test kafka cluster would have surfaced even with older sarama versions if the test had involved more than just connecting to the kafka cluster.

Fixes: #118525
Release note: none

Co-authored-by: Wenyi Hu
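The failure mode described above can be reproduced end to end with a loopback TLS handshake. The following Go sketch rests on stated assumptions: `newServerCert` and `handshake` are hypothetical helpers, `kafka-node.example.internal` is a made-up fully qualified name, and Go's `crypto/tls` stands in for the Java TLS stack that Kafka brokers actually use. It shows a client that connects by FQDN rejecting a certificate that lists only `localhost`, then accepting one that also lists the FQDN.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"math/big"
	"time"
)

// newServerCert (hypothetical helper) creates a self-signed server
// certificate for dnsNames, plus a pool that trusts exactly that certificate.
func newServerCert(dnsNames []string) (tls.Certificate, *x509.CertPool, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
		DNSNames:              dnsNames,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	leaf, err := x509.ParseCertificate(der)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	pool := x509.NewCertPool()
	pool.AddCert(leaf)
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key, Leaf: leaf}, pool, nil
}

// handshake (hypothetical helper) starts a loopback TLS server whose
// certificate carries dnsNames, then dials it while verifying the
// certificate against serverName. It returns the client-side dial error.
func handshake(dnsNames []string, serverName string) error {
	cert, pool, err := newServerCert(dnsNames)
	if err != nil {
		return err
	}
	ln, err := tls.Listen("tcp", "127.0.0.1:0", &tls.Config{Certificates: []tls.Certificate{cert}})
	if err != nil {
		return err
	}
	defer ln.Close()
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		if tc, ok := conn.(*tls.Conn); ok {
			_ = tc.Handshake() // drive the server side of the handshake
		}
	}()
	// ServerName overrides the dialed address for hostname verification.
	conn, err := tls.Dial("tcp", ln.Addr().String(), &tls.Config{RootCAs: pool, ServerName: serverName})
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	const fqdn = "kafka-node.example.internal" // assumed broker host name
	fmt.Println("localhost only:  ", handshake([]string{"localhost"}, fqdn))
	fmt.Println("localhost + FQDN:", handshake([]string{"localhost", fqdn}, fqdn))
}
```

The first call should return a certificate verification error and the second should return nil, which mirrors the effect of adding the node's hostname to the test certificate.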
roachtest.cdc/kafka-auth failed with artifacts on master @ ed3a25e3c9459cede2f80babbfc9d44a836b6c12:
Parameters:
ROACHTEST_arch=amd64
ROACHTEST_cloud=gce
ROACHTEST_coverageBuild=false
ROACHTEST_cpu=4
ROACHTEST_encrypted=false
ROACHTEST_metamorphicBuild=false
ROACHTEST_ssd=0
Jira issue: CRDB-35771