[Bug]: NRF52 nodes becomes unresponsive after upgrade to 2.5.8. Node is working properly if I disable Neighbour Info module #5235

Closed
iondulgheru opened this issue Nov 3, 2024 · 9 comments
Labels
bug Something isn't working

Comments

@iondulgheru

iondulgheru commented Nov 3, 2024

Category

Other

Hardware

Rak4631

Firmware Version

2.5.8

Description

After I upgraded to 2.5.8, my RAK4631 nodes started becoming unresponsive. I also reproduced this on 2.5.9 and 2.5.10.
The problem goes away if I disable the Neighbour Info module or downgrade to 2.5.7.
From my tests, the node can still send messages but no longer receives any.
After a reboot the node becomes responsive again for a short while (and occasionally at other times), but once I send a traceroute request to it, it becomes unresponsive again.
The problem also occurs when MQTT is enabled on the device with "Proxy to client enabled" while the device is connected to the phone via BLE.

I could also reproduce it on a T1000-E.

I also tried to reproduce this on a Lilygo T3 S3 with an ESP32, but the results were not as clear-cut, so I can't confirm that it also happens on ESP32 devices.

Relevant log output

No response

@iondulgheru iondulgheru added the bug Something isn't working label Nov 3, 2024
@iondulgheru iondulgheru changed the title [Bug]: NRF52 becomes unresponsive after upgrade to 2.5.8. Node is working properly if I disable Neighbour Info module [Bug]: NRF52 nodes becomes unresponsive after upgrade to 2.5.8. Node is working properly if I disable Neighbour Info module Nov 3, 2024
@caveman99
Member

We definitely need debug logs for this one. Can't reproduce.

@iondulgheru
Author

I had one of the devices connected over USB serial during the tests, but I didn't see anything out of the ordinary.
Once the device stopped receiving LoRa packets, no receive-related logs appeared when I sent messages to it or requested a traceroute from another device. There were only some info and telemetry logs.
When the device is working properly, logs related to receiving LoRa transmissions do appear.
Because of this, I even used a HackRF to check that the sender device was really transmitting, and it was. I also sent a message to another device from the same sender node, and that worked without problems.
I will do another test tomorrow and copy all the logs.
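
For anyone collecting logs for a report like this, here is a minimal sketch of a helper that timestamps serial output and saves it to a file, so the moment receive logs stop appearing is easy to find afterwards. It is not part of the firmware or any Meshtastic tool. The names `stamp_lines` and `main` and the output file name are my own assumptions, and it only expects plain text lines on standard input.

```python
import sys
from datetime import datetime, timezone


def stamp_lines(lines, now=lambda: datetime.now(timezone.utc)):
    """Yield each non-empty input line prefixed with a UTC timestamp."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line:
            yield f"{now().isoformat(timespec='seconds')} {line}"


def main(out_path):
    """Echo stamped lines from stdin and append them to out_path."""
    with open(out_path, "a", encoding="utf-8") as out:
        for stamped in stamp_lines(sys.stdin):
            print(stamped)
            out.write(stamped + "\n")
            out.flush()  # keep the file current in case the session dies


# Only runs when an output path is given, e.g. `python stamp_log.py node.log`
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Pipe into it from whatever command prints the node's serial output on your system (for example a serial terminal, or the Meshtastic Python CLI's `--noproto` mode if your install has it), such as `<serial dump command> | python stamp_log.py node.log`.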

@GUVWAF
Member

GUVWAF commented Nov 4, 2024

@iondulgheru Can you check whether the build from https://github.com/meshtastic/firmware/actions/runs/11671000613 (scroll to the bottom) fixes your issue?

@horstfffl

horstfffl commented Nov 4, 2024

I can confirm this bug, or something similar, with stable nrf52840-2.5.9.936260f on a RAK4631 connected via onboard USB serial.
When connected via client.meshtastic.org, the devices work at first.
But after a while only sending messages works (I also checked with an SDR, at least as far as confirming that the devices send data),
and the app doesn't show new messages anymore.

Edit: tried firmware-nrf52840-rak4631-2.5.11.772404f.
Sending messages may keep it alive, because it didn't happen while I sent messages back and forth every 5 minutes or so.
But after stopping that and waiting ~1h, it is back to the same state: sending works in the web client, but the messages are not shown.
Restarting Chromium and reconnecting to serial seems to bring it back, and it even shows the last message sent before restarting the web client.
So maybe it's web client or OS related?

@koliha

koliha commented Nov 5, 2024

I don't think this is related to the nRF52 specifically. I flashed two T-Echos and a Station G2 to 2.5.9 last night and enabled Neighbor Info on all three with an interval of 900 seconds. I hadn't grasped that from 2.5.8 onward the neighbor info broadcasts only happen over MQTT. The G2, which didn't have MQTT enabled, was the easiest to reproduce this on. I think the node runs into some issue when it goes to send the neighbor info while MQTT is disabled or disconnected (the T-Echos were not connected via Bluetooth but had MQTT enabled, and they had the same issue).

When this issue occurs, the node still sends broadcast messages, but it seems to stop broadcasting any telemetry data. It also will not ack any messages or respond to traceroutes or remote admin.

I disabled neighbor info on all of my nodes and this behavior stopped. The neighbor info code changes in 2.5.8 may need to be examined, specifically what happens when MQTT is disabled or disconnected? Not sure.
#5087

@GUVWAF
Member

GUVWAF commented Nov 5, 2024

Someone on Discord mentioned that the build I shared above fixes the issue for them (using Linux Native). If someone with an nRF52 can confirm as well, I think we can close this.

@iondulgheru
Author

@GUVWAF, I installed your build and the problem seems to be gone. I will monitor it during the day.

@fifieldt
Contributor

fifieldt commented Nov 5, 2024

fixed by #5254

@fifieldt fifieldt closed this as completed Nov 5, 2024
@iondulgheru
Author

I can confirm that the problem is fixed, although my serial connection is a little unstable now and sometimes locks up. This is not a big issue for me. By the way, I am using Linux.
Having Neighbour Info enabled no longer causes problems, which is good for people who upgrade the firmware but are unaware of the Neighbour Info changes.
After testing the custom build I also installed 2.5.11, and the result is the same: I don't see any communication problems with the Neighbour Info module enabled.
Thank you.

6 participants