
GraphQL SSE crashes on large message stanza payload #4208

Closed
stsisnevich-hj opened this issue Jan 12, 2024 · 2 comments

@stsisnevich-hj

MongooseIM version: 6.2.0-44-g07a168ed1
Installed from: Helm Chart using mongooseim/mongooseim:latest image
Erlang/OTP version: Erlang/OTP 26 [erts-14.1.1] [source] [64-bit] [smp:8:1] [ds:8:1:10] [async-threads:1] [jit:ns]

Context

I am currently running MongooseIM in a Kubernetes cluster (AWS EKS) using the official Helm chart from ESL/MongooseHelm to provide a chat feature for my application. The client connects to the chat over WebSocket. To make the chat interactive, a system bot consumes and responds to user messages over a c2s XML stream. Recently, I started migrating to the GraphQL API to unify the communication approach on both sides and to simplify the client implementation by separating message consumption from chat actions.

Issue

Since migrating to GraphQL, I have encountered a problem with the SSE stream in MongooseIM. It occurs when a large message stanza payload (~2000 characters) is broadcast over the GraphQL SSE stream. The SSE stream crashes, which closes the EventSource, but the user's session becomes stuck in a live state (it cannot be kicked) for an indefinite period of time. Subsequent reconnection attempts do not resolve the issue, and SSE does not work (no messages are broadcast) for the affected user until the server is restarted. Meanwhile, WebSocket and c2s connections do not crash on such messages and keep working while the SSE stream is broken.
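
For reference, the client-side subscription is roughly the sketch below (TypeScript, browser EventSource). The endpoint host, port and path are placeholders rather than exact values, and authentication is omitted; only the subscription selection set matches the fields visible in the logs:

```typescript
// Rough sketch of the client-side GraphQL SSE subscription (not the exact
// production code). Host, port and path are placeholders for my deployment;
// authentication handling is omitted.
const query = `subscription {
  stanza {
    subscribeForMessages { timestamp stanza_id stanza sender }
  }
}`;

const url =
  "http://mongooseim.example.com:8088/api/graphql/sse?query=" +
  encodeURIComponent(query);

const source = new EventSource(url);

source.onmessage = (event: MessageEvent) => {
  // Each SSE data frame carries one GraphQL execution result, e.g.
  // {"data":{"stanza":{"subscribeForMessages":{...}}}}
  const result = JSON.parse(event.data);
  console.log(result.data?.stanza?.subscribeForMessages);
};

source.onerror = () => {
  // Once a ~2000-character stanza is pushed, the server-side handler crashes,
  // the stream is closed, and this fires; reconnecting does not help.
  console.error("SSE stream closed");
};
```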

Additional Information:

  • I intentionally use the latest MongooseIM version from the master branch (Docker image latest tag), because I was struggling with SSE connection timeouts, which were addressed by merging the following PR: Increase idle_timeout for SSE connections #4196.
  • I rely on mod_muc with the mnesia backend and mod_mam with the rdbms backend.
  • Usually the GraphQL SSE stream fails on message bodies of 1600+ characters, and always on 2000+ (see the reproduction sketch below).
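
For completeness, here is a rough reproduction sketch. It assumes sending the message through the GraphQL HTTP API via a stanza.sendMessage mutation; the mutation shape, endpoint URL and room JID below are placeholders and assumptions from my setup rather than exact values, and any send path (plain XMPP works too) that delivers a groupchat message with a ~2000-character body to a user subscribed over SSE triggers the same crash:

```typescript
// Hypothetical reproduction sketch: deliver a groupchat message whose <body>
// is ~2000 characters long to a MUC room that the affected user is watching
// over the GraphQL SSE stream. The mutation shape, endpoint and room JID are
// assumptions; adjust them to your own setup. Auth headers are omitted.
async function reproduce(): Promise<void> {
  const body = "test 2000 " + "t".repeat(2000);

  const mutation = `
    mutation SendBigMessage($to: JID!, $body: String!) {
      stanza {
        sendMessage(to: $to, body: $body) { id }
      }
    }
  `;

  const response = await fetch("http://mongooseim.example.com:8088/api/graphql", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      query: mutation,
      variables: { to: "[email protected]", body },
    }),
  });

  // The HTTP request itself succeeds; the crash happens in the SSE handler
  // of the subscribed user.
  console.log(response.status);
}

reproduce();
```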

Attachments:

mongooseim.json.1

{"when":"2024-01-11T12:09:41.025376+00:00","what":"mongoose_graphql_sse_handler_failed","text":{"id":"28","data":["{\"data\":{\"stanza\":{\"subscribeForMessages\":{\"timestamp\":\"2024-01-11T12:09:41.025198Z\",\"stanza_id\":\"a99f7e0ee96f429bbeabb6e5b06fdcc9\",\"stanza\":\"<message id='a99f7e0ee96f429bbeabb6e5b06fdcc9' to='[email protected]' from='alice@xmpp-dev/1704-974935-740798-752907a8916cb029' type='groupchat'><body>test 1600 tttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttt</body></message>\",\"sender\":","\"[email protected]/Alice\"}}}}"]},"reason":"badarg","meta":{"pid":"<0.1362.0>","mfa":"{mongoose_graphql_sse_handler,handle_error,3}","line":"94","file":"/home/circleci/project/src/graphql/mongoose_graphql_sse_handler.erl"},"level":"error"}
{"when":"2024-01-11T12:09:41.715586+00:00","report":[["{initial_call,{cowboy_stream_h,request_process,['Argument__1','Argument__2','Argument__3']}}","{pid,<0.1362.0>}","{registered_name,[]}","{error_info,{error,{badrecord,ok},[{lasse_handler,terminate,3,...},{cowboy_loop,call,...},{cowboy_stream_h,...},{...}|...]}}","{ancestors,[<0.1361.0>,<0.1110.0>,<0.1107.0>,<0.1106.0>,ranch_sup,<0.250.0>]}","{message_queue_len,0}","{messages,[]}","{links,[<0.1361.0>]}","{dictionary,[{rand_seed,{#{type => exsss,next => #Fun<rand.0.65977474>,bits => 58,...},[29127266007832741|184983864797656587]}}]}","{trap_exit,true}","{status,running}","{heap_size,4185}","{stack_size,28}","{reductions,64816}"],""],"meta":{"pid":"<0.1362.0>","mfa":"{proc_lib,crash_report,4}","logger_formatter":{"title":"CRASH REPORT"},"line":"584","file":"proc_lib.erl","error_logger":{"type":"crash_report","tag":"error_report"},"domain":["otp","sasl"]},"level":"error","label":"{proc_lib,crash}"}
{"when":"2024-01-11T12:09:41.716895+00:00","unstructured_log":"Ranch listener 'ejabberd_cowboy_0.0.0.0_8088', connection process <0.1361.0>, stream 1 had its request process <0.1362.0> exit with reason {badrecord,ok} and stacktrace [{lasse_handler,terminate,3,[{file,\"/home/circleci/project/_build/default/lib/lasse/src/lasse_handler.erl\"},{line,106}]},{cowboy_loop,call,5,[{file,\"/home/circleci/project/_build/default/lib/cowboy/src/cowboy_loop.erl\"},{line,81}]},{cowboy_stream_h,execute,3,[{file,\"/home/circleci/project/_build/default/lib/cowboy/src/cowboy_stream_h.erl\"},{line,306}]},{cowboy_stream_h,request_process,3,[{file,\"/home/circleci/project/_build/default/lib/cowboy/src/cowboy_stream_h.erl\"},{line,295}]},{proc_lib,init_p_do_apply,3,[{file,\"proc_lib.erl\"},{line,241}]}]\n","meta":{"pid":"<0.1361.0>","error_logger":{"tag":"error"}},"level":"error"}

error.log

2024-01-11 12:09:41.715 [error] <0.1362.0>@lasse_handler:terminate:106 CRASH REPORT Process <0.1362.0> with 0 neighbours crashed with reason: bad record ok in lasse_handler:terminate/3 line 106
2024-01-11 12:09:41.717 [error] <0.1362.0>@lasse_handler:terminate:106 Cowboy stream 1 with ranch listener 'ejabberd_cowboy_0.0.0.0_8088' and connection process <0.1361.0> had its request process exit with reason: bad record ok in lasse_handler:terminate/3 line 106

crash.log

2024-01-11 12:09:41 =CRASH REPORT====
  crasher:
    initial call: cowboy_stream_h:request_process/3
    pid: <0.1362.0>
    registered_name: []
    exception error: {{badrecord,ok},[{lasse_handler,terminate,3,[{file,"/home/circleci/project/_build/default/lib/lasse/src/lasse_handler.erl"},{line,106}]},{cowboy_loop,call,5,[{file,"/home/circleci/project/_build/default/lib/cowboy/src/cowboy_loop.erl"},{line,81}]},{cowboy_stream_h,execute,3,[{file,"/home/circleci/project/_build/default/lib/cowboy/src/cowboy_stream_h.erl"},{line,306}]},{cowboy_stream_h,request_process,3,[{file,"/home/circleci/project/_build/default/lib/cowboy/src/cowboy_stream_h.erl"},{line,295}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}
    ancestors: [<0.1361.0>,<0.1110.0>,<0.1107.0>,<0.1106.0>,ranch_sup,<0.250.0>]
    message_queue_len: 0
    messages: []
    links: [<0.1361.0>]
    dictionary: [{rand_seed,{#{type => exsss,next => #Fun<rand.0.65977474>,bits => 58,uniform => #Fun<rand.1.65977474>,uniform_n => #Fun<rand.2.65977474>,jump => #Fun<rand.3.65977474>},[29127266007832741|184983864797656587]}}]
    trap_exit: true
    status: running
    heap_size: 4185
    stack_size: 28
    reductions: 64816
  neighbours:
2024-01-11 12:09:41 =ERROR REPORT====
Ranch listener 'ejabberd_cowboy_0.0.0.0_8088', connection process <0.1361.0>, stream 1 had its request process <0.1362.0> exit with reason {badrecord,ok} and stacktrace [{lasse_handler,terminate,3,[{file,"/home/circleci/project/_build/default/lib/lasse/src/lasse_handler.erl"},{line,106}]},{cowboy_loop,call,5,[{file,"/home/circleci/project/_build/default/lib/cowboy/src/cowboy_loop.erl"},{line,81}]},{cowboy_stream_h,execute,3,[{file,"/home/circleci/project/_build/default/lib/cowboy/src/cowboy_stream_h.erl"},{line,306}]},{cowboy_stream_h,request_process,3,[{file,"/home/circleci/project/_build/default/lib/cowboy/src/cowboy_stream_h.erl"},{line,295}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]

Questions

Could something be wrong at the configuration level in my case?
Are there any suggestions for mitigating this issue?

Thank you for your attention to this matter. Let me know if further clarification or details are needed.

@gustawlippa (Contributor)

Thank you for reporting the issue.
It seems that this is a problem with the Cowboy library, but we'll investigate, as we've already encountered some problems with SSE configuration in the past.

@JanuszJakubiec (Contributor)

Hi, the error was investigated and fixed in PR #4364.
