Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade from 3.1.2 to 4 has broken message delivery #2151

Open
binarymist opened this issue May 29, 2022 · 5 comments
Open

Upgrade from 3.1.2 to 4 has broken message delivery #2151

binarymist opened this issue May 29, 2022 · 5 comments
Labels

Comments

@binarymist
Copy link

binarymist commented May 29, 2022

Environment:

  • Node.js Version: 17
  • Redis Server Version: v=6.2.3
  • Node Redis Version: 4.1.0
  • Platform: Dev platform: Release Linux Mint 20.3 Una 64-bit Prod platform: Any Linux

This is a continuation from #1870 (@leibale you were working on that). We had to release with version 3.1.2 (our existing version) as we couldn't keep consumers waiting any longer. Now circling back to work out what the actual issue is in hope that we can upgrade node redis.

At this stage it's hard to tell. We're not getting any errors. We've upgraded node redis to the latest now (4.1.0), read the Changelog again, been over the v3 to v4 MIgration Guide many times and made the required changes, also in line with createClient configuration.

This is our consolidated diff between the code that works with 3.1.2 (main branch) and the upgraded code to satisfy 4.1.0 requirements (binarymist/upgrade-incl-redis branch).

The most obvious place to be focussing on is in the testerWatcher.

We have two different communication techniques from our CLI to the orchestrator (Server Sent Events (sse), and long polling (lp)). The sse appears to be working fine with the upgrade of node redis, but the lp is not.

The change you see in the get.js is only applicable to sse so I don't think there's any issue in get.js.

We have multiple Test Sessions (each relly on their specific redis channel. The first one seems to work). The changes to the orchestrator.js are unlikely to be an issue if one of the redis channels is working, which leaves our foucs to the testerWatcher... and specifically pollTesterMessages.

There is either something we've somehow missed changing with the upgrade, although I can't see this being the case as we've gone over your upgrade path a good number of times over the last few months. Or there is a bug in node redis. Either way it needs to be fixed in order for us to upgrade node redis.

A high level overview of our architecture can be seen here.

Thanks.

@binarymist binarymist added the Bug label May 29, 2022
@leibale
Copy link
Contributor

leibale commented May 30, 2022

@binarymist what's the issue? you are getting an error? not getting messages?

@binarymist
Copy link
Author

binarymist commented May 30, 2022

No error, but only getting messages for the tls tester. In saying that, occasionally we get messages for the app tester as well. Just FYI: We currently have two testers implemented app and tls.

I'll continue debugging over the next few days as I get time and report back on any findings or lack of. At this stage, I'm just wondering if you see anything obvious in our changes that don't look right? The only changes in this change-set are to bring node redis up-to-date.

Thanks.

@binarymist
Copy link
Author

OK, so our changes to update the node redis client are all good? But something is broken.

Have spent about a week on this all up so far...

The three areas we narrowed down to and investigated:

  1. Test that the callback to subscribe in the function pollTesterMessages of testerWatcher.js is being invoked for every message and channel we expect.
    It seems right. Also when we have a break point on line 93 only of testerWatcher.js, that's the following code:
    await nonBlockingClients[channel].rPush(redisList, message);
    The Test Run seems to run fine.
    This tells us that the redis pubSub channels are fine and it tells us that our messages are attempted at being rPushed onto the redis lists,
    but when I inspect the nonBlockingClients[channel] (via consol.log rather than a break point) for the messages in it, many are missing.
    Why is this? Point 3 answers this?

    Without the break point, often we don't get any messages back to the CLI, but occasionally we do.
    If we run the CLI again, we then get the messages back. I think fixing point 3 also explains this?

  2. Test that getTesterMessages in the function pollTesterMessages of testerWatcher.js is getting the testerMessageSet's that we expect.
    It's not.
    Sometimes we get all the expected messages, sometimes we don't. I think that point 3 answers this?
    Sometimes we get duplicate messages. Don't think point 3 answers this @leibale?

  3. Work out why call to function getTesterMessages from pollTesterMessages of testerWatcher.js does not return correct testerMessageSet that we expect
    It appears that blpop now returns an object with key and element rather than the expected and documented (https://redis.io/commands/blpop/#return) array? @leibale did we miss this change being documented somewhere?
    Next: Potentially debug code in following block of getTesterMessages of testerWatcher.js:
    if (curListLen > 0) {
    Although I'm pretty sure this is fine

Are there other undocumented changes or possibly documented changes we've missed?

Thanks.

binarymist added a commit to purpleteam-labs/purpleteam-orchestrator that referenced this issue Jun 11, 2022
@binarymist
Copy link
Author

Would be good to get some feedback on this?

@leibale
Copy link
Contributor

leibale commented Jun 11, 2022

@binarymist sorry it took so long, but I was on vacation...

regarding point 3 - some commands have transformReply function that transforms the raw reply (which in some cases is hard to use/parse) into more convenient data structures. if you want to get the raw reply - use client.sendCommand directly.

In order to see what a command returns you can:

  1. trust your IDE, it uses the built-in TypeScript declarations.
  2. checkout the transformReply function in the command file (for example, here is the BLPOP file)
  3. use the docs (https://redis.js.org/documentation/client/)

if you want, I would like to try and debug those bugs with you on Monday (around 12 pm NYC time)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants