LLT/JBD & JKBMS BLE problems #86
Comments
Ok, while the BLE setup has been running for a while and was mostly stable, it has always had moments where it would just hang. I've been thinking about a watchdog process that would check that new values arrive from the BMS within a certain time frame. Let me know if you think this is a good idea and I might find some time in the next few weeks to implement it. Due to the instability, I'll switch back to a serial connection. |
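A minimal sketch of the watchdog idea described above, assuming the driver can call a hook each time a valid frame arrives. The class name `StaleDataWatchdog`, the `max_age_s` parameter and the `feed()` hook are hypothetical and not part of dbus-serialbattery.

```python
import time


class StaleDataWatchdog:
    """Tracks when fresh BMS data last arrived (hypothetical helper)."""

    def __init__(self, max_age_s=60.0, clock=time.monotonic):
        self.max_age_s = max_age_s
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._last_update = clock()

    def feed(self):
        """Call whenever a new, valid frame has been parsed from the BMS."""
        self._last_update = self._clock()

    def age(self):
        """Seconds since the last call to feed()."""
        return self._clock() - self._last_update

    def is_stale(self):
        """True when no fresh data has arrived within max_age_s."""
        return self.age() > self.max_age_s
```

A polling loop could check `is_stale()` once per cycle and trigger a reconnect or log a warning; what to do on staleness is left open here.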
I had the same thing, gone back to 1.32 |
Unfortunately, I spent over 100 hours troubleshooting this and found no real solution to all these Bluetooth problems. I therefore decided not to put any further effort into the Bluetooth part. Another reason is that users apparently do not appreciate the work I do and cannot imagine how time-consuming all of this is. With 1-2 donations a month, if any, across more than 9,000 dbus-serialbattery installations, it is not worth it. If everyone donated 1 €/year, the motivation would be different, but that is not the case. Feel free to open a PR :-) |
This comment was marked as off-topic.
Just when i see in log : Then in battery.py |
See also Louisvdw/dbus-serialbattery#1092 (comment), just for information. |
I'm also trying v1.3.20240705 and seeing fewer disconnects (none so far). As commented elsewhere, the driver has some flaky logic that kills the driver on disconnect in an attempt to recover, but it seems to make things worse. Reverting some of that band-aid logic seems prudent. |
Are you all sure that the data is still refreshing? This version has no check for that, which could be one reason why it does not report any issues. |
You could be right, but I've been watching the cell voltages pretty consistently and have seen constant changes. BTW, I tried the dev branch and also saw exits after it attempted to reset the Bluetooth daemon when run manually (after the driver crashed).
I have a CH340G TTL coming and will move to wired, but I think that, despite the buggy Bluetooth stack on the RPi, it should be capable of continuing to work. |
I think this is the main difference in logic between the older 1.3 and the 1.5 version. result from a failure is captured in the current cycle in 1.5, but in the older 1.3 the failure is looked at in the subsequent cycle where the
whereas the older version allows the above logic to run through (tries every 500 ms for 20 cycles) before flagging result as bad.
This is important because in 1.5 result instantly returns on a single failure, whereas in 1.3 the failure would be retried a number of times first. I'm not a coder and really don't know Python, but this seems like an important difference. Happy to be wrong. Bluetooth can be glitchy, so retries are important. I also still think killing bluetoothd is not the right approach, and I think the use of bluetoothctl needs some reworking. bluetoothctl --timeout S should be used in scripts (bluez/bluez#826, bluez/bluez#826 (comment)) to limit how long it runs, since scan on and similar commands can take some time to fully complete. A longer cooling-off period between disconnects and reconnects, with status checks that these have completed, is probably required so that the tasks BlueZ is handling can finish properly and race conditions are avoided. |
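To illustrate the "retry before flagging as bad" behaviour described above, here is a rough sketch. It is not the driver's actual code: `read_with_retries` and the `read_once` callable are hypothetical stand-ins for one BMS poll, and the defaults simply mirror the 20 cycles at 500 ms mentioned in the comment.

```python
import time


def read_with_retries(read_once, attempts=20, delay_s=0.5, sleep=time.sleep):
    """Call read_once() until it returns True or the attempts are used up.

    read_once is a hypothetical callable standing in for one BMS poll.
    Returns True on the first success and False only after every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if read_once():
            return True
        if attempt < attempts:
            sleep(delay_s)  # wait before the next try, but not after the last one
    return False
```

With this shape, a single glitchy read does not mark the connection as bad; only a run of consecutive failures does.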
Also check these issues, PRs and discussions to get a complete overview: |
Not that it's terribly helpful, but I can add that the 20240705 version (mostly) reliably gets data from the BLE on the varter heated battery I have installed in my application. I don't see any significant issues when running this version that I can identify with my limited experience with this technology. |
Okay, so I managed to track down more info on this segmentation fault....
and now have the following data from the failure:
I thought this pointed to the function merge_dicts, but have found that it varies from failure to failure.
I've added some logging to it to see what else might pop up. The dictionary info looks like this:
Dict2 keys vary from call to call.
Still a segmentation fault, but different in some ways... |
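For anyone else trying to capture where the interpreter is when it segfaults, one option is Python's standard `faulthandler` module, which dumps the Python traceback of all threads when a fatal signal such as SIGSEGV arrives. This is only a sketch: the log path below is a hypothetical choice, and where to place the call in the driver is an assumption.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; pick any writable path on the device.
CRASH_LOG = os.path.join(tempfile.gettempdir(), "dbus-serialbattery-crash.log")

# Keep the file object alive for the life of the process: faulthandler
# writes to its file descriptor from inside the signal handler.
crash_log = open(CRASH_LOG, "a")

# Dump the traceback of every thread on SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL.
faulthandler.enable(file=crash_log, all_threads=True)
```

The same handler can be switched on without code changes by starting the interpreter with `python -X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable, in which case the trace goes to stderr. Since the failing frame varies between crashes, the trace may point at the victim rather than the cause.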
I may be chasing this down a rabbit hole, but I have come to the conclusion that using the Raspberry Pi's BLE together with Wi-Fi is problematic, with a BlueZ issue potentially compounding it. Using an external BLE adapter is probably good advice [EDIT: confirmed still having frequent dropouts with two different USB adapters].
I'm still against the nuclear option of killing bluetoothd as the regular way of handling connection issues; it is bad practice. It could be included as a config option for people who are impacted and for whom this is the only way to recover, but it remains a band-aid for the underlying bug(s) in BlueZ/Bleak. Handling the connected state and riding out the issues with a disconnect and reconnect seems more prudent, with a backoff mechanism and retries (with logging). Bluetooth isn't serial; disconnections and dropouts are part of life with BLE.
Anecdotally, the older logic in v1.3 does seem more forgiving, and I have had it continuing to work for 24+ hours (confirmed updates coming into the GUI; I didn't get into adding code to capture dropouts and recoveries, but ideally these should be logged). https://bleak.readthedocs.io/en/latest/troubleshooting.html#occasional-not-connected-errors-on-raspberry-pi
NB: the logic in enable.sh that disables the internal BT depends on /u-boot/config.txt having "dtoverlay=miniuart-bt" somewhere. Currently that is not the case, at least not on my fairly fresh install of Venus OS 3.5. An enhancement that appends disable-bt to the overlay is required instead:
existing:
proposed:
|
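The backoff-and-retry approach with logging suggested in the comment above could look roughly like the following. This is a sketch under assumptions: `connect` is a hypothetical callable that returns True once the BLE link is up, and the delay parameters are illustrative values, not tuned ones.

```python
import logging
import time

logger = logging.getLogger("ble-reconnect")


def backoff_delays(base_s=1.0, factor=2.0, max_s=60.0, retries=8):
    """Yield capped exponential delays: base, base*factor, ... up to max_s."""
    delay = base_s
    for _ in range(retries):
        yield min(delay, max_s)
        delay *= factor


def reconnect_with_backoff(connect, sleep=time.sleep, **backoff_kwargs):
    """Try connect() repeatedly, waiting longer after each failure.

    connect is a hypothetical callable that returns True once the link is up.
    """
    if connect():
        return True
    for attempt, delay in enumerate(backoff_delays(**backoff_kwargs), start=1):
        logger.warning("BLE connect failed, retry %d in %.1f s", attempt, delay)
        sleep(delay)
        if connect():
            logger.info("BLE reconnected after %d retries", attempt)
            return True
    logger.error("BLE reconnect gave up")
    return False
```

Logging every retry and recovery also gives the record of dropouts that the comment says is currently missing; restarting bluetoothd could then be kept as an opt-in last resort after this returns False.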
Describe the bug
When running v1.4.20240928, I regularly get segfaults in the python binary using a JKBMS via BLE.
I've downgraded again to v1.3.20240705 and this one has run fine for a day now. I know this is hard to debug, but I'm happy to help. If you know how to enable core dumps on this, it would be helpful; otherwise I can research it.
How to reproduce
Install the latest release and wait a few hours; segfaults occur regularly.
Expected behavior
No segfaults
Driver version of the currently installed driver
v1.4.20240928
Driver version of the last known working driver
v1.3.20240705
Venus OS device type
Raspberry Pi 4
Venus OS version
v3.42
BMS type
JKBMS (Heltec BMS)
Cell count
16
Battery count
2
Connection type
Bluetooth
Config file
Relevant log output
Any other information that may be helpful
No response