[feat][broker]: intercept transaction begin and end events for observability #14613

Merged 7 commits on Mar 18, 2022

Conversation

madhavan-narayanan (Contributor) commented Mar 9, 2022

Master Issue: #12858

Fixes: #12858

Motivation

This change gives cluster operators visibility into transactions handled by the broker. Currently there is no way to know when a transaction begins, which produce/ack operations it contains, or how it ends (commit or rollback). Without this information, it is hard to support end customers who ask about missing messages or failed processing.

Modifications

  • Extended the org.apache.pulsar.broker.intercept.BrokerInterceptor interface with additional callbacks for tracing transaction events
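
As an illustration of this kind of extension, here is a minimal, self-contained sketch. It is not the code from this PR: `TxnEventListener` is a hypothetical stand-in for the transaction-related part of `BrokerInterceptor`. The callback names `txnOpened` and `txnEnded` come from the review discussion below, while the `txnEnded` parameters and the numeric commit/abort encoding are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the transaction callbacks added to BrokerInterceptor.
// Default no-op bodies let existing implementations compile unchanged.
interface TxnEventListener {
    default void txnOpened(long tcId, String txnID) { }

    // Assumed signature: the PR text does not show the parameters of txnEnded.
    default void txnEnded(String txnID, long txnAction) { }
}

// Example implementation that records one line per transaction event.
class RecordingTxnListener implements TxnEventListener {
    // Assumed encoding of the end action, for illustration only.
    static final long COMMIT = 0;
    static final long ABORT = 1;

    private final List<String> events = new ArrayList<>();

    @Override
    public void txnOpened(long tcId, String txnID) {
        events.add("opened tc=" + tcId + " txn=" + txnID);
    }

    @Override
    public void txnEnded(String txnID, long txnAction) {
        String outcome = txnAction == COMMIT ? "commit"
                : txnAction == ABORT ? "abort"
                : "unknown(" + txnAction + ")";
        events.add("ended txn=" + txnID + " action=" + outcome);
    }

    List<String> events() {
        return events;
    }
}

class TxnListenerDemo {
    public static void main(String[] args) {
        RecordingTxnListener listener = new RecordingTxnListener();
        listener.txnOpened(0L, "txn-42");
        listener.txnEnded("txn-42", RecordingTxnListener.COMMIT);
        listener.events().forEach(System.out::println);
    }
}
```

Using default methods means an interceptor that does not care about transactions needs no changes, while one that does overrides only the callbacks it wants.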

Verifying this change

  • Make sure that the change passes the CI checks.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

github-actions bot commented Mar 9, 2022

@madhavan-narayanan: Thanks for your contribution. For this PR, do we need to update docs?
(The PR template contains info about docs, which helps others understand the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

github-actions bot commented Mar 9, 2022

@madhavan-narayanan: Thanks for providing doc info!

@github-actions github-actions bot added doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Mar 9, 2022
gaoran10 (Contributor) commented Mar 9, 2022

Could we add these interceptor methods in PulsarCommandSender? And we could add a new test in BrokerInterceptorTest.

codelipenghui (Contributor) left a comment

Please help add tests for covering the new changes.

 * @param tcId Transaction Coordinator Id
 * @param txnID Transaction ID
 */
default void beginTxn(long tcId, String txnID) {
Contributor

It would be better to use newTxn, to stay consistent with the command.

Contributor

And from the implementation, I think it should be txnOpened and txnEnded?

Contributor Author

Done

Comment on lines 2173 to 2177
ctx.writeAndFlush(Commands.newTxnResponse(requestId, txnID.getLeastSigBits(),
        txnID.getMostSigBits()));
if (getBrokerService().getInterceptor() != null) {
    getBrokerService().getInterceptor().beginTxn(command.getTcId(), txnID.toString());
}
Contributor

It would be better to move this into PulsarCommandSender so that sent commands and their interception are managed in one place.

Contributor Author

Done

Contributor Author

Hi @codelipenghui,
I have added unit tests and refactored the code as you recommended. Can you please review again?

@codelipenghui codelipenghui added this to the 2.11.0 milestone Mar 15, 2022
@gaoran10 (Contributor)

Currently, the BrokerInterceptor interface has a method onPulsarCommand that can handle all Pulsar commands. Adding the specific methods txnOpened and txnEnded alongside it could be confusing. Do we need to add events in BrokerInterceptor for all commands?

@madhavan-narayanan (Contributor Author)

> Currently, there is a method onPulsarCommand in the interface BrokerInterceptor, this method could handle all Pulsar commands. If we add specific methods txnOpened and txnEnded, I think it's confusing. Do we need to add events in BrokerInterceptor for all commands?

Hi @gaoran10,
Unfortunately, the command objects available in the onPulsarCommand method do not carry enough context to provide sufficient observability. For example, the CommandEndTxnResponse object does not indicate whether the transaction was committed or aborted.
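
To make the gap concrete, here is a minimal sketch (illustrative only, not code from this PR). `EndTxnResponseSketch` is a hypothetical, simplified stand-in for an end-transaction response command that identifies the transaction but has no commit/abort field. The numeric action encoding and all method names are assumptions.

```java
// Hypothetical, simplified stand-in for an end-transaction response command:
// it identifies the transaction but carries no commit/abort information.
final class EndTxnResponseSketch {
    final long requestId;
    final long txnIdMostBits;
    final long txnIdLeastBits;

    EndTxnResponseSketch(long requestId, long txnIdMostBits, long txnIdLeastBits) {
        this.requestId = requestId;
        this.txnIdMostBits = txnIdMostBits;
        this.txnIdLeastBits = txnIdLeastBits;
    }
}

class TxnOutcomeVisibility {
    // Assumed encoding of the end action, for illustration only.
    static final long COMMIT = 0;
    static final long ABORT = 1;

    // A generic command hook sees only the response, so the outcome stays unknown.
    static String fromCommand(EndTxnResponseSketch response) {
        return "txn (" + response.txnIdMostBits + "," + response.txnIdLeastBits
                + ") ended, outcome unknown";
    }

    // A dedicated callback can be handed the action directly.
    static String fromEvent(String txnID, long txnAction) {
        String outcome = txnAction == COMMIT ? "committed"
                : txnAction == ABORT ? "aborted"
                : "unknown";
        return "txn " + txnID + " " + outcome;
    }

    public static void main(String[] args) {
        System.out.println(fromCommand(new EndTxnResponseSketch(1L, 0L, 42L)));
        System.out.println(fromEvent("(0,42)", ABORT));
    }
}
```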

@gaoran10 (Contributor)

@madhavan-narayanan Maybe we could add a new parameter, Object context, to the onPulsarCommand method to make it suitable for more cases. I'm concerned that we would otherwise need to add too many events to BrokerInterceptor.

@madhavan-narayanan (Contributor Author)

> @madhavan-narayanan Maybe we could add a new param Object context for method onPulsarCommand to make it suitable for more cases. I'm concerned that we need to add too many events in BrokerInterceptor.

@gaoran10, I understand your concern. I can add an overloaded 'onPulsarCommand' method with an additional 'context' parameter to avoid a proliferation of callbacks in the future. But isn't the opaque 'Object context' itself a concern? Developers implementing the interceptor would have to learn the structure and semantics of the context by browsing the broker code or the documentation of 'onPulsarCommand', which I am afraid could become unwieldy and ambiguous over time.
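
The trade-off can be sketched as follows (illustrative only, not code from this PR). The shape of the context (a map with a "txnAction" key holding a Long) and the commit/abort encoding are assumptions invented for the example. That is the point: with an opaque context, the implementer has to know such conventions.

```java
import java.util.Map;

class OpaqueContextDemo {
    // Generic hook style: the implementer must know that the context is a Map,
    // that the key is "txnAction", and that the value is a Long. None of this
    // is visible in the method signature.
    static String outcomeFromContext(Object context) {
        if (context instanceof Map) {
            Object action = ((Map<?, ?>) context).get("txnAction");
            if (action instanceof Long) {
                return ((Long) action) == 0L ? "commit" : "abort";
            }
        }
        return "unknown";
    }

    // Typed callback style: the signature documents exactly what is available.
    static String outcomeFromTypedCallback(String txnID, long txnAction) {
        return txnAction == 0L ? "commit" : "abort";
    }

    public static void main(String[] args) {
        Map<String, Object> context = java.util.Collections.singletonMap("txnAction", 1L);
        System.out.println(outcomeFromContext(context));
        // A caller that stores an Integer instead of a Long silently breaks the hook.
        System.out.println(outcomeFromContext(java.util.Collections.singletonMap("txnAction", 1)));
        System.out.println(outcomeFromTypedCallback("txn-42", 1L));
    }
}
```

The second call shows how a small mismatch in the undocumented context shape degrades to "unknown" at runtime, whereas the typed callback would fail at compile time.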

@gaoran10 (Contributor)

@madhavan-narayanan OK. The onPulsarCommand method receives all Pulsar commands, and some commands also have their own event method in BrokerInterceptor, which may be more convenient for some users. We could improve onPulsarCommand in the future if needed. Thanks

@codelipenghui codelipenghui changed the title pulsar-broker: Intercept transaction begin and end events for observability [feat][broker]: intercept transaction begin and end events for observability Mar 18, 2022
@codelipenghui codelipenghui merged commit 86442ee into apache:master Mar 18, 2022
aparajita89 pushed a commit to aparajita89/pulsar that referenced this pull request Mar 21, 2022
…ability (apache#14613)

Nicklee007 pushed a commit to Nicklee007/pulsar that referenced this pull request Apr 20, 2022
…ability (apache#14613)

kishorepulla pushed a commit to kishorepulla/pulsar that referenced this pull request Apr 21, 2022
…ability (apache#14613)

kishorepulla pushed a commit to kishorepulla/pulsar that referenced this pull request Apr 21, 2022
…ability (apache#14613)

yugadeepaIntuit pushed a commit to yugadeepaIntuit/pulsar that referenced this pull request Oct 10, 2022
…ability (apache#14613)

Labels
doc-not-needed Your PR changes do not impact docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[PIP 106] Broker extensions to provide operators of enterprise-wide clusters better control and flexibility
3 participants