Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(plugin-server): run node-rdkafka with cooperative rebalancing patched in #17747

Merged
merged 5 commits into from
Oct 10, 2023

Conversation

bretthoerner
Copy link
Contributor

@bretthoerner bretthoerner commented Oct 3, 2023

Problem

We run a fork of node-rdkafka that supports cooperative rebalancing but doesn't keep up with upstream node-rdkafka nor librdkafka. The main (Blizzard) upstream repo seems to be alive and has some bugfixes, features, and runs the latest librdkafka (2.2.0). I prefer to stay up to date on clients for systems like Kafka where the clients are smart and do a lot of coordination.

So...

We could, alternatively, patch acosom-node-rdkafka up to librdkafka 2.2.0. 🤷 I'm indifferent, these 2 repos are pretty similar.

Changes

Swap acosom-node-rdkafka for upstream node-rdkafka with the acosom cooperative rebalance patch applied.

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

How did you test this code?

Existing tests. We'll need to carefully roll this out.

@PostHog PostHog deleted a comment from posthog-bot Oct 3, 2023
@bretthoerner bretthoerner requested a review from a team October 3, 2023 21:29
@bretthoerner
Copy link
Contributor Author

bretthoerner commented Oct 4, 2023

I talked to @xvello about this. Their preference was to use a Git(Hub) dependency, so I tried that.

  1. Our new patched repo (with working CI) is here: https://github.com/PostHog/node-rdkafka/
  2. Attempt to use a Git(Hub) dependency failed because pnpm (technically yarn, which is uses for deps) don't leave you with a proper git clone but instead use git archive, and so the librdkafka submodule isn't available: 743045c
$ pnpm i
Packages: +1 -1
+-
Progress: resolved 1411, reused 1402, downloaded 0, added 0, done
node_modules/.pnpm/github.com+posthog+node-rdkafka@e22c2a9e3b1b49b6b98a7489980ef990a29c3a28/node_modules/node-rdkafka: Running install script, failed in 596ms
.../node_modules/node-rdkafka install$ node-gyp rebuild
│ gyp info it worked if it ends with ok
│ gyp info using [email protected]
│ gyp info using [email protected] | darwin | arm64
│ gyp info find Python using Python version 3.11.5 found at "/opt/homebrew/opt/[email protected]/bin/python3.11"
│ gyp info spawn /opt/homebrew/opt/[email protected]/bin/python3.11
│ gyp info spawn args [
│ gyp info spawn args   '/Users/brett/.nvm/versions/node/v18.17.1/lib/node_modules/pnpm/dist/node_modules/node-gyp/gyp/gyp_main.py',
[...snip...]
│ gyp info spawn args ]
│ gyp info spawn make
│ gyp info spawn args [ 'BUILDTYPE=Release', '-C', 'build' ]
│   ACTION deps_librdkafka_gyp_librdkafka_target_configure deps/librdkafka/config.h
│ ./configure: line 13: ./configure: No such file or directory

configure isn't there because the submodule isn't initialized, and submodules (and the git repo itself) aren't available at all. This is apparently a known and very longstanding issue: yarnpkg/yarn#1488

We have a few options:

  1. Try to make pnpm (and yarn) do the right thing here. (No thanks?)
  2. Add a hack inside our fork that manually git clones the librdkafka dependency before building. This sucks because we'll need to duplicate the librdkafa hash somewhere, since we don't have the git repo available.
  3. Combine the two ideas, carry a dependency+patch like this PR, but now we have a repo we can rebase and generate patches from. (My preference, I've updated this PR with a patch generated by doing git diff blizzard/master in that repo, where blizzard is obviously a remote pointing to upstream)

Thoughts?

Copy link
Contributor

@xvello xvello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into it!

Sorry about the pnpm git dependency side-quest, thanks for looking into it though. There's a risk of subtle breakages from changing rdk, so let's target a Tuesday deploy next week?
We should also document how the patch was generated . Maybe add a section in plugin-server/README.md?

@bretthoerner
Copy link
Contributor Author

so let's target a Tuesday deploy next week? We should also document how the patch was generated . Maybe add a section in plugin-server/README.md?

Added directions, and Tuesday sounds good.

@bretthoerner bretthoerner merged commit d68fa14 into master Oct 10, 2023
@bretthoerner bretthoerner deleted the brett/rdkafka-patch branch October 10, 2023 14:13
daibhin pushed a commit that referenced this pull request Oct 23, 2023
…atched in (#17747)

* Remove node-rdkafka-acosom

* Add node-rdkafka

* Replace node-rdkafka-acosom imports with node-rdkafka

* Patch node-rdkafka with changes from https://github.com/PostHog/node-rdkafka/

* Add patch directions to README
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants