[improve][pip] PIP-352: Event time based topic compactor #22710

Merged (3 commits) on Jul 31, 2024

@marekczajkowski (Contributor) commented May 14, 2024

PIP: 352

Motivation

Currently, two compactors are available: TwoPhaseCompactor and
StrategicTwoPhaseCompactor. The latter is used specifically for internal
load balancing and not for regular compaction of Pulsar topics. The
former can be configured via CompactionServiceFactory in broker.conf.

I believe it would be advantageous to introduce another type of topic
compactor that operates on event time. Such a compactor would retain the
desired messages in the topic while preserving the order expected by
external applications. Although applications may send messages stamped
with the current event time, variations in network conditions or
redeliveries can cause messages to be stored in the Pulsar topic in a
different order than intended. Event time-based checks would mitigate
this problem.
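
To make the idea concrete, here is a minimal Java sketch of event-time-based selection over an in-memory list. It is not the PIP-352 implementation and uses no Pulsar classes: `Msg`, `compact`, and `EventTimeCompactionSketch` are hypothetical names, and the tie-breaking rule (on equal event times, the later-stored message wins) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EventTimeCompactionSketch {

    /** Hypothetical stand-in for a stored message; not a Pulsar type. */
    record Msg(String key, long eventTime, String value) {}

    /**
     * For each key, keep the message with the highest event time.
     * Assumption: on equal event times, the later-stored message wins.
     */
    static List<Msg> compact(List<Msg> stored) {
        // LinkedHashMap keeps keys in the order they were first seen.
        Map<String, Msg> latest = new LinkedHashMap<>();
        for (Msg m : stored) {
            Msg current = latest.get(m.key());
            if (current == null || m.eventTime() >= current.eventTime()) {
                latest.put(m.key(), m);
            }
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        // Storage order differs from event-time order for key "k1".
        List<Msg> stored = List.of(
                new Msg("k1", 100, "a"),
                new Msg("k1", 300, "c"),  // newest event, stored early
                new Msg("k1", 200, "b"),  // older event, stored last
                new Msg("k2", 50, "x"));
        System.out.println(compact(stored));
    }
}
```

In this example a compactor that keeps the last-stored message per key would retain value "b" for k1, whereas the event-time check retains "c", which is the newest from the application's point of view.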

Modifications

Added PIP



@github-actions bot added the PIP and doc-required labels May 14, 2024
@dao-jun added this to the 3.4.0 milestone May 14, 2024
@dao-jun added the doc label and removed the doc-required label May 14, 2024

@lhotari (Member) commented May 15, 2024

Added a comment about an unsolved challenge: #22517 (comment)

@marekczajkowski (Contributor, Author) commented

@lhotari what are the next steps to proceed?

@lhotari (Member) commented Jun 19, 2024

> Added a comment about an unsolved challenge: #22517 (comment)

This has been addressed.

@lhotari (Member) commented Jun 19, 2024

> @lhotari what are the next steps to proceed?

I've described this in the email response to the discussion thread:
https://lists.apache.org/thread/ocrbhlhs049px5w9mz9gfym4wpq4701f

Please start a new vote thread for PIP-352.

@heesung-sn (Contributor) commented

Can this be implemented by StrategicTwoPhaseCompactor with another compaction strategy?

@marekczajkowski (Contributor, Author) commented

> Can this be implemented by StrategicTwoPhaseCompactor with another compaction strategy?

Not really. StrategicTwoPhaseCompactor is used specifically for internal load balancing and is not
employed for regular compaction of Pulsar topics.

@lhotari (Member) left a comment

PIP-352 has been approved in this voting thread: https://lists.apache.org/thread/pp6c0qqw51yjw9szsnl2jbgjsqrx7wkn

@lhotari lhotari merged commit 9d0292e into apache:master Jul 31, 2024
20 checks passed
grssam pushed a commit to grssam/pulsar that referenced this pull request Sep 4, 2024
Labels: doc, PIP
4 participants