Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

socket-mode: fix bug when apps.connections.open returns an error and won't retry #1735

Merged
merged 1 commit into from
Jan 26, 2024

Conversation

filmaj
Copy link
Contributor

@filmaj filmaj commented Jan 25, 2024

Fixes #1500.

The root of the bug seems to be when trying to emit to a state from within setTimeout wrapper function. The fix is to return a promise in the do state machine function, resolve the promise after the dynamic waiting period, and use onSuccess and transitionTo to transition to the correct state.

You can test the faulty behaviour by running the test in this PR against the unmodified code prior to the changes introduced in this PR. It helps to set the client's log level to DEBUG to get insight into behaviour. Without the patch in this PR, the debug output would look like this if apps.connections.open returns an error:

[INFO]  socket-mode:SocketModeClient:0 Going to establish a new connection to Slack ...
[DEBUG]  socket-mode:SocketModeClient:0 Transitioning to state: connecting:authenticating
[DEBUG]  socket-mode:SocketModeClient:0 Going to retrieve a new WSS URL ...
[DEBUG]  web-api:WebClient:0 apiCall('apps.connections.open') start
[DEBUG]  web-api:WebClient:0 http request url: http://localhost:12345/apps.connections.open
[DEBUG]  web-api:WebClient:0 http request body: {}
[DEBUG]  web-api:WebClient:0 http request headers: {}
[DEBUG]  web-api:WebClient:0 http response received
[DEBUG]  web-api:WebClient:0 http request result: {"ok":false,"error":"fatal_error","response_metadata":{}}
[ERROR]  socket-mode:SocketModeClient:0 Failed to retrieve a new WSS URL for reconnection (error: Error: An API error occurred: fatal_error)
[WARN]  socket-mode:SocketModeClient:0 Failed to start a Socket Mode connection (error: An API error occurred: fatal_error)
[DEBUG]  socket-mode:SocketModeClient:0 Transitioning to state: connecting:reconnecting
[DEBUG]  socket-mode:SocketModeClient:0 Before trying to reconnect, this client will wait for 25 milliseconds

.. and will sit stuck there (because the state is not being transitioned).

With the patch in this PR, the logs now are (with the test simulating two calls to apps.connections.open failing but succeeding on the third one):

[INFO]  socket-mode:SocketModeClient:2 Going to establish a new connection to Slack ...
[DEBUG]  socket-mode:SocketModeClient:2 Transitioning to state: connecting:authenticating
[DEBUG]  socket-mode:SocketModeClient:2 Going to retrieve a new WSS URL ...
[DEBUG]  web-api:WebClient:2 apiCall('apps.connections.open') start
[DEBUG]  web-api:WebClient:2 http request url: http://localhost:12345/apps.connections.open
[DEBUG]  web-api:WebClient:2 http request body: {}
[DEBUG]  web-api:WebClient:2 http request headers: {}
[DEBUG]  web-api:WebClient:2 http response received
[DEBUG]  web-api:WebClient:2 http request result: {"ok":false,"error":"fatal_error","response_metadata":{}}
[ERROR]  socket-mode:SocketModeClient:2 Failed to retrieve a new WSS URL for reconnection (error: Error: An API error occurred: fatal_error)
[WARN]  socket-mode:SocketModeClient:2 Failed to start a Socket Mode connection (error: An API error occurred: fatal_error)
[DEBUG]  socket-mode:SocketModeClient:2 Transitioning to state: connecting:reconnecting
[DEBUG]  socket-mode:SocketModeClient:2 Before trying to reconnect, this client will wait for 25 milliseconds
[DEBUG]  socket-mode:SocketModeClient:2 Resolving reconnecting state to continue with reconnect...
[DEBUG]  socket-mode:SocketModeClient:2 Transitioning to state: connecting:authenticating
[DEBUG]  socket-mode:SocketModeClient:2 Going to retrieve a new WSS URL ...
[DEBUG]  web-api:WebClient:2 apiCall('apps.connections.open') start
[DEBUG]  web-api:WebClient:2 http request url: http://localhost:12345/apps.connections.open
[DEBUG]  web-api:WebClient:2 http request body: {}
[DEBUG]  web-api:WebClient:2 http request headers: {}
[DEBUG]  web-api:WebClient:2 http response received
[DEBUG]  web-api:WebClient:2 http request result: {"ok":false,"error":"fatal_error","response_metadata":{}}
[ERROR]  socket-mode:SocketModeClient:2 Failed to retrieve a new WSS URL for reconnection (error: Error: An API error occurred: fatal_error)
[WARN]  socket-mode:SocketModeClient:2 Failed to start a Socket Mode connection (error: An API error occurred: fatal_error)
[DEBUG]  socket-mode:SocketModeClient:2 Transitioning to state: connecting:reconnecting
[DEBUG]  socket-mode:SocketModeClient:2 Before trying to reconnect, this client will wait for 50 milliseconds
[DEBUG]  socket-mode:SocketModeClient:2 Resolving reconnecting state to continue with reconnect...
[DEBUG]  socket-mode:SocketModeClient:2 Transitioning to state: connecting:authenticating
[DEBUG]  socket-mode:SocketModeClient:2 Going to retrieve a new WSS URL ...
[DEBUG]  web-api:WebClient:2 apiCall('apps.connections.open') start
[DEBUG]  web-api:WebClient:2 http request url: http://localhost:12345/apps.connections.open
[DEBUG]  web-api:WebClient:2 http request body: {}
[DEBUG]  web-api:WebClient:2 http request headers: {}
[DEBUG]  web-api:WebClient:2 http response received
[DEBUG]  web-api:WebClient:2 http request result: {"ok":true,"url":"ws://localhost:23456/","response_metadata":{}}
[DEBUG]  web-api:WebClient:2 apiCall('apps.connections.open') end
[DEBUG]  socket-mode:SocketModeClient:2 Transitioning to state: connecting:authenticated
[DEBUG]  socket-mode:SocketModeClient:2 Transitioning to state: connecting:handshaking
[DEBUG]  socket-mode:SocketModeClient:2 Received a message on the WebSocket: {"type":"hello"}
[DEBUG]  socket-mode:SocketModeClient:2 Transitioning to state: connected
[INFO]  socket-mode:SocketModeClient:2 Now connected to Slack

@filmaj filmaj added bug M-T: A confirmed bug report. Issues are confirmed when the reproduction steps are documented semver:patch pkg:socket-mode applies to `@slack/socket-mode` labels Jan 25, 2024
@filmaj filmaj added this to the [email protected] milestone Jan 25, 2024
@filmaj filmaj requested review from seratch and a team January 25, 2024 21:25
@filmaj filmaj self-assigned this Jan 25, 2024
Copy link
Member

@seratch seratch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! LGTM

@seratch
Copy link
Member

seratch commented Jan 26, 2024

@filmaj A bit off-topic but would it be feasible to move the integration test into src and run it as part of CI builds for future regression detection? If there is a certain reason to place the test under test directory, we may want to execute the npm command to run it in CI builds.

@WilliamBergamin
Copy link
Contributor

Nice 💯 this is was a tricky bug, and now has an elegant solution

@filmaj
Copy link
Contributor Author

filmaj commented Jan 26, 2024

@seratch it is being executed in CI build: https://github.com/slackapi/node-slack-sdk/actions/runs/7660650322/job/20878386230?pr=1735#step:5:88
(it is part of the npm test script)

I will merge this but if you feel strongly about moving the integration tests into src/ we can do so in a follow-up PR. I have no strong feelings about it, I just assumed, as we do with other packages, that the tests under src/ are more in the style of unit tests (since they correlate closely to individual source files).

@filmaj filmaj merged commit 46b500d into main Jan 26, 2024
15 checks passed
@filmaj filmaj deleted the connections-error-retry branch January 26, 2024 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug M-T: A confirmed bug report. Issues are confirmed when the reproduction steps are documented pkg:socket-mode applies to `@slack/socket-mode` semver:patch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SocketModeClient still fails to automatically reconnect when apps.connections.open API returns an error code
3 participants