Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Strimzi Metrics Reporter #954

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

fvaleri
Copy link

@fvaleri fvaleri commented Dec 17, 2024

This patch adds support for metrics types configuration, integrating the Strimzi Metrics Reporter), and custom Prometheus JMX Exporter configuration. The custom configuration feature is needed for Cluster Operator integration, but it can also be leveraged by standalone users. See proposal 43 for more details.

@fvaleri fvaleri requested a review from ppatierno December 17, 2024 09:26
@fvaleri
Copy link
Author

fvaleri commented Dec 17, 2024

@OwenCorrigan76 @mimaison fyi

Copy link

@mimaison mimaison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not looked at the code yet. I left comments trying to understand how this will be used.

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should go through a dedicated proposal first. The original proposal - https://github.com/strimzi/proposals/blob/main/064-prometheus-metrics-reporter.md - sadly didn't cover the bridge in sufficient depth and there will be some backward compatibility impact here or in the operators repo that needs to be carefully considered. A proposal would be the right place to discuss it and decide on the best approach.

@fvaleri
Copy link
Author

fvaleri commented Dec 17, 2024

there will be some backward compatibility impact here or in the operators repo that needs to be carefully considered

I agree. @ppatierno wdyt?

Copy link
Member

@ppatierno ppatierno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left just some minor comments but I will review it again when the proposal is accepted (to avoid reviewing now something that will be changed later by the final proposal)

config/application.properties Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
pom.xml Show resolved Hide resolved
pom.xml Outdated Show resolved Hide resolved
@fvaleri fvaleri force-pushed the bridge-metrics-rep branch from 8b9e345 to d0eedcd Compare January 2, 2025 08:04
@fvaleri fvaleri added this to the 0.32.0 milestone Jan 2, 2025
@fvaleri fvaleri force-pushed the bridge-metrics-rep branch from ddabfb1 to 5084403 Compare January 17, 2025 17:42
@fvaleri fvaleri changed the title Add support for the Strimzi Metrics Reporter Add support for for metrics types configuration Jan 24, 2025
@fvaleri fvaleri changed the title Add support for for metrics types configuration Add support for Strimzi Metrics Reporter Jan 24, 2025
This patch adds support for metrics types configuration, integrating the Strimzi Metrics Reporter, and custom Prometheus JMX Exporter configuration.
The custom configuration feature is needed for Cluster Operator integration, but it can also be leveraged by standalone users.
See Proposal 43 for more details.

Signed-off-by: Federico Valeri <[email protected]>
@fvaleri
Copy link
Author

fvaleri commented Jan 27, 2025

@strimzi/maintainers proposal 92 has been accepted, and this PR is aligned, so open for review. Thanks.

@@ -9,6 +9,15 @@
* Both the `/openapi` and `/openapi/v3` endpoints return the OpenAPI v3 definition of the bridge REST API.
* Removed script to build bridge configuration within the container.
It is going to be set up by the Strimzi operator within a ConfigMap and mounted as volume on the bridge pod.
* Added support for [Strimzi Metrics Reporter](https://github.com/strimzi/metrics-reporter).
This is a Kafka plugin that directly exposes metrics in Prometheus format without using JMX, and can be enabled by setting `bridge.metrics=strimziMetricsReporter`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think users really understand the JMX reference here and the use of JMX is actually completely internal to what it replaces.

@@ -0,0 +1,33 @@
[id='proc-configuring-kafka-bridge-jmx-metrics-{context}']
= Configuring JMX Exporter metrics
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
= Configuring JMX Exporter metrics
= Configuring Prometheus JMX Exporter metrics

@@ -8,11 +8,13 @@
[role="_abstract"]
Configure a deployment of the Kafka Bridge using configuration properties.
Configure Kafka and specify the HTTP connection details needed to be able to interact with Kafka.
It is possible to enable metrics in Prometheus format with JMX Prometheus Exporter or Strimzi Metrics Reporter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It is possible to enable metrics in Prometheus format with JMX Prometheus Exporter or Strimzi Metrics Reporter.
It is possible to enable metrics in Prometheus format with Prometheus JMX Exporter or Strimzi Metrics Reporter.

bridge.metrics=jmxPrometheusExporter
----
+
Optionally, you can set a custom JMX Exporter configuration using the `bridge.metrics.exporter.config.path` property.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Optionally, you can set a custom JMX Exporter configuration using the `bridge.metrics.exporter.config.path` property.
Optionally, you can set a custom Prometheus JMX Exporter configuration using the `bridge.metrics.exporter.config.path` property.

./bin/kafka_bridge_run.sh --config-file=<path>/application.properties
----
+
With metrics enabled, you can use `GET /metrics` with the `/metrics` endpoint to retrieve Kafka Bridge metrics in Prometheus format.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
With metrics enabled, you can use `GET /metrics` with the `/metrics` endpoint to retrieve Kafka Bridge metrics in Prometheus format.
With metrics enabled, you can scrape metrics in the Prometheus format from the `/metrics` endpoint of the Kafka Bridge.

Comment on lines +609 to +610
} else if (routingContext.statusCode() == HttpResponseStatus.NOT_IMPLEMENTED.code()) {
message = HttpResponseStatus.NOT_IMPLEMENTED.reasonPhrase();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we using the not-implemented code for? Not implemented is not the same as not enabled.

Comment on lines +854 to +865
static class NotSupportedApi extends HttpOpenApiOperation {
public NotSupportedApi(HttpOpenApiOperations operationId) {
super(operationId);
}

@Override
public void process(RoutingContext routingContext) {
HttpBridgeError error = new HttpBridgeError(HttpResponseStatus.GONE.code(), "OpenAPI v2 Swagger not supported");
HttpUtils.sendResponse(routingContext, HttpResponseStatus.GONE.code(),
BridgeContentType.KAFKA_JSON, JsonUtils.jsonToBytes(error.toJson()));
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really related to the metrics?

@@ -43,14 +44,14 @@ public ProducerRecord<byte[], byte[]> toKafkaRecord(String kafkaTopic, Integer p
if (!keyNode.isTextual()) {
throw new IllegalStateException("Because the embedded format is 'text', the key must be a string");
}
key = keyNode.asText().getBytes();
key = keyNode.asText().getBytes(Charset.forName("UTF-8"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, are these changes really related to the metrics? How?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be rather some abstract class / interface with two implementations then this kind of union class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently it's the same (I guess the new file shows up because of the new metrics subpackage).
This class has the final goal to allow scraping (see scrape method) metrics from several registries: the JMX exporter one (for Kafka related metrics) and Vert.x one (for HTTP related metrics).
The weird thing is that it would contain now two registry instances which can't live together: JMX and Strimzi reporter.
Maybe this class should be more a MetricsScraper and we should have an abstrac/interface class named MetricsReporter for two different reporters?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be some registry interface or something?

@@ -1,5 +1,16 @@
#Bridge related settings
bridge.id=my-bridge

# uncomment the following line to enable JMX Prometheus Exporter, check the documentation for more details
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# uncomment the following line to enable JMX Prometheus Exporter, check the documentation for more details
# uncomment the following line to enable Prometheus JMX Exporter, check the documentation for more details

Optionally, you can set a comma-separated list of regexes used to filter exposed metrics using the `kafka.prometheus.metrics.reporter.allowlist` property.
If not set, all available metrics will be exposed.
+
You can add any plugin configuration to the Bridge properties file using the `kafka.` prefix.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, by using kafka. you are providing configuration which will be common to all the Kafka clients used by the bridge (producer, consumer, admin). You could use kafka.producer., kafka.consumer. or kafka.admin. prefixes to set configuration for a specific client. I was wondering, does the Strimzi metrics reporter work in this case?

&& !metricsTypeValue.equals(MetricsType.STRIMZI_REPORTER.toString())) {
throw new IllegalArgumentException(
String.format("Invalid %s configuration, choose one of %s and %s",
METRICS_TYPE, MetricsType.JMX_EXPORTER, MetricsType.STRIMZI_REPORTER)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a nit but you are in a if condition related to metrics validation only so the error message can only be related to that, not sure how much String formatting makes sense instead of a hard-coded error message.

if (metricsTypeValue.equals(MetricsType.STRIMZI_REPORTER.toString())) {
map.putIfAbsent("kafka.metric.reporters", "io.strimzi.kafka.metrics.KafkaPrometheusMetricsReporter");
map.putIfAbsent("kafka.prometheus.metrics.reporter.listener.enable", "false");
map.putIfAbsent("kafka.prometheus.metrics.reporter.allowlist", ".*");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one question ... I would expect that this part will be always set by the operator when the bridge is running within Kubernetes. We should also think about having under the overall kafka. prefix or for all specific components (i.e. do we allow the user to set different allowlist for producer, consumer, admin?).
If running on bare metal/VM are we sure that we want to set defaults instead of raising an exception for missing configuration?

/**
* @return the metric system to be used in the bridge
*/
public String getMetrics() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

}

/**
* @return the JMX Exporter configuration file path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @return the JMX Exporter configuration file path
* @return the Prometheus JMX Exporter configuration file path

* @throws IOException
*/
private static JmxCollectorRegistry getJmxCollectorRegistry(BridgeConfig bridgeConfig) throws MalformedObjectNameException, IOException {
if (bridgeConfig.getJmxExporterConfigPath() != null && Files.exists(bridgeConfig.getJmxExporterConfigPath())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But AFAICS it falls back to use the default configuration from the YAML within the jar. It's not ignoring, or?

Comment on lines +169 to +171
if (metricsReporter != null) {
routerBuilder.operation(this.METRICS.getOperationId().toString()).handler(this.METRICS);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I think that the current implementation thanks to the default Vert.x routing will just return 404 NOT FOUND if you hit the /metrics endpoint but metrics are not enabled.
I would leave the same behaviour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently it's the same (I guess the new file shows up because of the new metrics subpackage).
This class has the final goal to allow scraping (see scrape method) metrics from several registries: the JMX exporter one (for Kafka related metrics) and Vert.x one (for HTTP related metrics).
The weird thing is that it would contain now two registry instances which can't live together: JMX and Strimzi reporter.
Maybe this class should be more a MetricsScraper and we should have an abstrac/interface class named MetricsReporter for two different reporters?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants