slack_socket_mode_no_reply_received_error bot stopped responding with socket mode #820

Closed
5 of 10 tasks
timja opened this issue Mar 2, 2021 · 10 comments
Labels
bug M-T: confirmed bug report. Issues are confirmed when the reproduction steps are documented

Comments

@timja

timja commented Mar 2, 2021

Description

Hello, we had an issue with an internal Slack bot that we wrote.

It stopped responding to us, and when we checked the logs they were full of:

{"code":"slack_socket_mode_no_reply_received_error"}
[ERROR]  socket-mode:SocketModeClient:0 cannot send message when client is not connected
(node:1) UnhandledPromiseRejectionWarning: Error: Cannot send message when client is not connected
    at Object.sendWhileDisconnectedError (/opt/app/node_modules/@slack/socket-mode/dist/errors.js:56:26)
    at /opt/app/node_modules/@slack/socket-mode/dist/SocketModeClient.js:333:33
    at new Promise (<anonymous>)
    at SocketModeClient.send (/opt/app/node_modules/@slack/socket-mode/dist/SocketModeClient.js:329:16)
    at ack (/opt/app/node_modules/@slack/socket-mode/dist/SocketModeClient.js:466:24)
    at App.processEvent (/opt/app/node_modules/@slack/bolt/dist/App.js:424:19)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at async SocketModeClient.<anonymous> (/opt/app/node_modules/@slack/bolt/dist/receivers/SocketModeReceiver.js:90:13)

Not sure if it's a mistake in our implementation or something lower level; any guidance is welcome.

The code can be found here: https://github.com/hmcts/slack-help-bot/blob/main/app.js

What type of issue is this? (place an x in one of the [ ])

  • bug
  • enhancement (feature request)
  • question
  • documentation related
  • example code related
  • testing related
  • discussion

Requirements (place an x in each of the [ ])

  • I've read and understood the Contributing guidelines and have done my best effort to follow them.
  • I've read and agree to the Code of Conduct.
  • I've searched for any related issues and avoided creating a duplicate issue.

Bug Report

Filling out the following details about bugs will help us solve your issue sooner.

Reproducible in:

package version:
3.0.0

node version:
v14.8.0

OS version(s):
3.11.6
NAME="Alpine Linux"

Steps to reproduce:

  1. I assume it has to do with network connections not being handled properly, but we have only seen this once so far.

Expected result:

bot reconnects automatically

Actual result:

bot was stuck in a disconnected state and didn't respond to new events / messages until it was restarted

Attachments:

https://github.com/hmcts/slack-help-bot/blob/main/app.js

@stevengill
Member

Hey @timja!

Thanks for reporting this! Is the code still doing this or did the problem go away after you restarted the app?

@stevengill stevengill added bug M-T: confirmed bug report. Issues are confirmed when the reproduction steps are documented and removed untriaged labels Mar 3, 2021
@timja
Author

timja commented Mar 4, 2021

> Hey @timja!
>
> Thanks for reporting this! Is the code still doing this or did the problem go away after you restarted the app?

Went away after restarting it

@stevengill
Member

Glad to hear that! I'm going to leave this issue open for a little while to see if others are facing it.

@seratch
Member

seratch commented Aug 27, 2021

FWIW, the underlying "ws" package has been fixing bugs in v7 minor/patch versions. If your app uses a somewhat old version of it, upgrading ws to the latest may be worth trying. See also: slackapi/node-slack-sdk#1322

@timja
Author

timja commented Aug 27, 2021

Thanks, will give it a try. We've seen this quite a few times.

@timja
Author

timja commented Sep 2, 2021

> FWIW, the underlying "ws" package has been fixing bugs in v7 minor/patch versions. If your app uses a somewhat old version of it, upgrading ws to the latest may be worth trying. See also: slackapi/node-slack-sdk#1322

Seems we are on:

[email protected] /Users/timja/projects/hmcts/slack-help-bot
├─┬ @slack/[email protected]
│ └─┬ @slack/[email protected]
│ └── [email protected]
└─┬ [email protected]
└─┬ @jest/[email protected]
└─┬ [email protected]
└─┬ [email protected]
└─┬ [email protected]
└── [email protected] deduped

That's a little lower than the minimum version in https://github.com/slackapi/node-slack-sdk/pull/1322/files, so I think it needs either a resolution entry or a bump here. Is there any reason you aren't using the 8.x version?

@seratch
Member

seratch commented Sep 2, 2021

@timja We haven't received any issue reports with ws v8 so far but the major version was released just a month ago. That's the only reason why we didn't upgrade the minimum major version this time. After watching for a while, we may consider changing the version range of ws in the future.

@Ohiekkar

Ohiekkar commented Dec 21, 2021

Hello, I also faced an issue where the app was successfully running in socketMode and when a temporary network issue occurred, the app was unable to automatically reconnect and a manual restart of the application was required to get it running again.

I'm running @slack/bolt version 3.8.1 on Node 12.22.8 without any special configuration.

const { App } = require('@slack/bolt');

// Bolt app in Socket Mode with no extra configuration
const app = new App({
  token: process.env.BOT_TOKEN,
  appToken: process.env.SLACK_APP_TOKEN,
  socketMode: true,
});

// ...

(async () => {
  await app.start();
})();

Logs:

out.log
0|index    | [INFO]  socket-mode:SocketModeClient:0 A ping wasn't received from the server before the timeout of 30000ms!
0|index    | [INFO]  socket-mode:SocketModeClient:0 unable to Socket Mode start: A request error occurred: getaddrinfo ENOTFOUND slack.com

err.log
...
0|index    | [WARN]  bolt-app http request failed getaddrinfo ENOTFOUND slack.com
0|index    | [WARN]  bolt-app http request failed getaddrinfo ENOTFOUND slack.com
0|index    | [ERROR]  socket-mode:SocketModeClient:0 Error: A request error occurred: getaddrinfo ENOTFOUND slack.com
0|index    |     at requestErrorWithOriginal (/var/www/taakkabot-event-proxy/node_modules/@slack/web-api/dist/errors.js:28:33)
0|index    |     at /var/www/taakkabot-event-proxy/node_modules/@slack/web-api/dist/WebClient.js:297:65
0|index    |     at processTicksAndRejections (internal/process/task_queues.js:97:5) {
0|index    |   code: 'slack_webapi_request_error',
0|index    |   original: Error: getaddrinfo ENOTFOUND slack.com
0|index    |       at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:66:26)
…

I tried to have my app catch the error and just kill the process when it happens, but the error doesn't seem to end up in the global error handler app.error.
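One process-level fallback for the "kill the process" idea is Node's `unhandledRejection` event, which is where the "Cannot send message when client is not connected" error in the original report surfaced (it was logged as an UnhandledPromiseRejectionWarning). The sketch below is only an illustration: `exitOnUnhandledRejection` is a hypothetical helper name, and nothing in this thread confirms that the ENOTFOUND failure above reaches this event.

```javascript
// Sketch: exit on any unhandled promise rejection so that a supervisor
// (pm2, systemd, Kubernetes, ...) can restart the bot.
// `proc` and `log` are injectable only to make the helper easy to test;
// by default it hooks the real `process` and logs with console.error.
function exitOnUnhandledRejection(proc = process, log = console.error) {
  proc.on('unhandledRejection', (reason) => {
    log('Unhandled promise rejection, exiting so the supervisor restarts us:', reason);
    proc.exit(1);
  });
}
```

Calling `exitOnUnhandledRejection()` once at startup, before `app.start()`, would be enough. This trades availability during the restart for not staying stuck in a disconnected state.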

@timja
Author

timja commented Dec 21, 2021

@Ohiekkar I implemented this workaround to have my process monitor (Kubernetes in this case) restart it if this happened:
hmcts/slack-help-bot#46

No reports of issues since implementing that.
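For setups without an orchestrator-level probe, a rough in-process variant of the same "restart when unhealthy" idea could look like the sketch below. It is not the code from the linked PR. `createWatchdog`, `isHealthy`, `maxFailures`, and `onUnhealthy` are names made up for illustration, and how to get a reliable connection-state predicate out of the Slack libraries is left as an assumption.

```javascript
// Sketch: count consecutive failed health checks and fire a callback
// once the limit is reached. Call the returned check() on a timer.
function createWatchdog({ isHealthy, maxFailures = 3, onUnhealthy }) {
  let failures = 0;
  return function check() {
    if (isHealthy()) {
      failures = 0; // any healthy check resets the streak
      return;
    }
    failures += 1;
    if (failures >= maxFailures) {
      onUnhealthy(failures);
    }
  };
}

// Hypothetical wiring (not executed here):
//   const check = createWatchdog({
//     isHealthy: () => /* your connection-state predicate */ true,
//     onUnhealthy: () => process.exit(1), // let pm2/systemd/etc. restart us
//   });
//   setInterval(check, 10000);
```

Requiring several consecutive failures avoids restarting on a brief blip that the client recovers from on its own.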

@seratch
Member

seratch commented Apr 15, 2022

@timja Thanks for sharing the workaround 🙇

I've applied a bunch of improvements to the underlying @slack/socket-mode library. The behavior can be much more stable with the latest RC version. You can try it by running the following command:

npm i @slack/socket-mode@next
# or npm i @slack/socket-mode@1.3.0-rc.0

If you are interested in all the changes included in the release, refer to https://github.com/slackapi/node-slack-sdk/releases/tag/%40slack%2Fsocket-mode%401.3.0-rc.0 for the details. If everything goes well, we will release v1.3.0 next week!

Also, SocketModeClient#badConnection is a private property in the class. If you continue using your health check endpoint, switching to the newly added public method isActive(): boolean would be a safer way in terms of long-term maintenance.
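As a rough illustration of a health check built on `isActive()`, here is a minimal sketch. It is not the code from hmcts/slack-help-bot#46: the `/health` path, port 3000, and the helper names `healthStatus` and `createHealthServer` are made up for the example, and reaching the client through `receiver.client` on a Bolt `SocketModeReceiver` is an assumption to verify against the version you have installed.

```javascript
const http = require('http');

// Map the Socket Mode client's state to an HTTP status.
// `client` is anything exposing isActive(): boolean.
function healthStatus(client) {
  return client.isActive()
    ? { statusCode: 200, body: 'OK' }
    : { statusCode: 503, body: 'Socket Mode client is not active' };
}

// Build (but do not start) an HTTP server that answers GET /health.
function createHealthServer(client) {
  return http.createServer((req, res) => {
    if (req.url !== '/health') {
      res.writeHead(404);
      res.end();
      return;
    }
    const { statusCode, body } = healthStatus(client);
    res.writeHead(statusCode, { 'Content-Type': 'text/plain' });
    res.end(body);
  });
}

// Hypothetical wiring (not executed here; assumes receiver.client exists):
//   const { App, SocketModeReceiver } = require('@slack/bolt');
//   const receiver = new SocketModeReceiver({ appToken: process.env.SLACK_APP_TOKEN });
//   const app = new App({ token: process.env.BOT_TOKEN, receiver });
//   createHealthServer(receiver.client).listen(3000);
```

A process monitor such as a Kubernetes liveness probe can then poll the endpoint and restart the container when it returns 503.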

Since we won't have any further activities on this thread, let me close this issue now. Thanks again for taking the time to share the issue here!

@seratch seratch closed this as completed Apr 15, 2022
@filmaj filmaj removed this from the 3.x milestone Nov 14, 2023