Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][pulsar-io-kafka] Add option to copy Kafka headers to Pulsar properties #1

Closed
wants to merge 1 commit into from

Conversation

aymkhalil
Copy link
Owner

@aymkhalil aymkhalil commented Sep 22, 2022

Fixes apache#17760

Master Issue: #

Motivation

Users of Kafka source today can't access their Kafka headers in Pulsar. This patch adds a config to enable this copy to take place in addition to some standard, useful Kafka headers.

Modifications

  • A new KafkaSourceConfig flag is added: copyHeadersEnabled of type boolean and default value of false
  • If the flag is toggle on, the following Kafka headers will be copied over to Pulsars properties.
  • __kafka_topic : the name of the Kafka topic consumed by the source.
  • __kafka_partition: the number of paritioed on the consumed kafka topics
  • __kafka_offset: the current offset of the kafka message currently consumed by the source
  • User defined header: The customer headers a Kafka producer could generated as pairs of <String, byte[]>. Since Pulsar properties are of type <String, String>, the Kafka byte[] values are based64 encoded.

Verifying this change

  • Make sure that the change passes the CI checks.

This change added tests and can be verified as follows:

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (add or upgrade a dependency) (no)
  • The public API (no)
  • The schema (no)
  • The default values of configurations (no)
  • The binary protocol (no)
  • The REST endpoints (no)
  • The admin CLI options (no)
  • Anything that affects deployment (no)

Documentation

  • doc-required
    (Your PR needs to update docs and you will update later)

  • doc-not-needed
    (Please explain why)

  • doc
    (Your PR contains doc changes)

  • doc-complete
    (Docs have been already added)

Matching PR in forked repository

PR in forked repository: #1

@aymkhalil aymkhalil force-pushed the pulsar-io-kafka-headers branch 2 times, most recently from c0f0eaa to cef00bc Compare September 23, 2022 00:31
@aymkhalil aymkhalil changed the title [feature][pulsar-io-kafka] Add option to copy Kafka headers to Pulsar… [improve][pulsar-io-kafka] Add option to copy Kafka headers to Pulsar properties Sep 23, 2022
Copy link

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@@ -189,19 +195,35 @@ public void start() {

public abstract KafkaRecord buildRecord(ConsumerRecord<Object, Object> consumerRecord);

protected Map<String, String> copyKafkaHeaders(ConsumerRecord<Object, Object> consumerRecord) {
Map<String, String> properties = new HashMap<>();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can save a allocations by returning Collections.emptyMap here if the feature is not enabled

ImmutableMap.of("schema.registry.url", getRegistryAddressInDockerNetwork())
);
ImmutableMap.of("schema.registry.url", getRegistryAddressInDockerNetwork()));
sourceConfig.put("copyHeadersEnabled", true);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The integ test is failing because the kafka-avro-console-producer doesn't respect the headers.pars option. Will disable the test for now and add it as needed.

confluentinc/schema-registry#2390

@aymkhalil aymkhalil force-pushed the pulsar-io-kafka-headers branch from cef00bc to c1dada7 Compare September 23, 2022 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kafka Source: add option to copy Kafka Headers to Pulsar properties
2 participants