Startup hangs with pyserial-asyncio-fast #184
Comments
The behaviour you describe, being quite erratic, leads me to suspect a hardware problem. I see no way that ramses_rf can cause a CPU leakage - it does not write directly to any hardware - it goes through the venerable pyserial library. There is no evidence in the screen captures you provide to justify blaming ramses_cc. I do not do any testing on Proxmox, so that's a complicating factor. Perhaps you could describe your Proxmox settings w.r.t. the serial port? I have asked the community for their input: https://community.home-assistant.io/t/honeywell-ch-dhw-via-rf-evohome-sundial-hometronics-chronotherm/151584/4401 (this is the best thread, as it is much more active). But you imply that when it does work, it continues to work fine? By 'fine', I mean packets are received ongoing, until you next restart HA? Please confirm.
Above, the Thus, communication between the two is established & there is no evidence of any issue in this instance... During startup, ramses_rf will send up to 24 of these commands, up to 20 a second, and give up after that time (or 3 seconds, whichever comes first):

    _SIGNATURE_GAP_SECS = 0.05
    _SIGNATURE_MAX_TRYS = 24
    _SIGNATURE_MAX_SECS = 3

My experience is that one (or two) such echo is enough; in your case, it sent two of these only because the 2nd command was sent before the 1st echo was received - a matter of timing, and quite OK. You didn't mention seeing a message in your HA log file like so:
... or::
Please have a look for them & let me know. So, I wonder if - with a VM - these values are not generous enough, maybe:

    _SIGNATURE_GAP_SECS = 0.05
    _SIGNATURE_MAX_TRYS = 100
    _SIGNATURE_MAX_SECS = 5

I will make it so, for the next update. |
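As a rough illustration of the timing described above, here is a minimal sketch of a send-until-echo loop bounded by both a try count and a time limit. It is not the ramses_rf implementation: the function `wait_for_echo`, its parameters, and the use of an `asyncio.Event` to signal the echo are all assumptions made for the example; only the three constant values come from the comment.

```python
import asyncio

# Values quoted in the comment above; everything else here is hypothetical.
SIGNATURE_GAP_SECS = 0.05
SIGNATURE_MAX_TRYS = 24
SIGNATURE_MAX_SECS = 3


async def wait_for_echo(send, echo_seen, *, gap=SIGNATURE_GAP_SECS,
                        max_trys=SIGNATURE_MAX_TRYS,
                        max_secs=SIGNATURE_MAX_SECS):
    """Call send() repeatedly until echo_seen is set or a limit is reached.

    Returns the number of sends it took, or None if no echo arrived in time.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + max_secs
    for attempt in range(1, max_trys + 1):
        if loop.time() >= deadline:
            break  # time limit reached before the try limit
        send()
        try:
            # Wait at most one gap for the echo before sending again.
            await asyncio.wait_for(echo_seen.wait(), timeout=gap)
            return attempt
        except asyncio.TimeoutError:
            continue
    return None
```

If the gap were the only delay, 24 tries at 0.05 s would finish after about 1.2 s, so the try limit would be reached well before the 3-second limit; with 100 tries the two limits roughly coincide at 5 s.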
You could try running in read-only mode & seeing what that does:
|
Many thanks on the extensive reply. I'll try to chip in on all your comments/questions:
The thing is, it's very clear that it is this specific integration that causes it. I have had this HA setup for quite a long time (over a year), using other USB dongles with passthrough (Z-Wave and SkyConnect) which work as expected, and never had anything like this. By killing the integration, HA just spins up as expected. I'm struggling to find ways of showing logs or anything that will reflect the behavior experienced, but at the moment it is less erratic than it was in the beginning and is now more consistent: if the integration is enabled in any way, HA Core just stops responding to any request a few seconds after booting up. Update: I just bumped into this post that explains a few more options for debugging (and that drew my attention to instability reports with the latest version of HA Core): Update 1: I upgraded HA Core to 2025.5.1
I've tested all the options for the USB passthrough, but all return the same behavior.
Thanks, checked the one reply there and they're using the same USB stick as me.
By 'fine' I mean I see packet logs in the file, as the configuration documentation leads one to expect. But I wasn't able to go any further than the log where you saw the packets flowing. Just that one time.
At this moment the only thing the HA logs mention about ramses is what is in the second screenshot I pasted in the OP. It just stops working right after.
That works! At least it did just now:
However, there's no ramses_cc integration added to HA in read-only mode. Some additional logs after adding a bunch of logger lines to the configuration.yaml file; it mentions it's using a cached schema. Is there a way of resetting that? This one appears many times:
I also tested without the read only mode and on 2025.5.1 again, but the behavior is the same (Core stops responding). I can't get logs easily right now due to being outside the LAN. Many thanks for your support. |
I had some time to delve into this again, with weird findings. So, before HA stops responding, ramses logs appear to be as expected:
As far as I can tell, the last log entry before it stops responding relates to `zigpy`:
I can't understand how this is related. I use zigbee2mqtt for zigbee devices. |
So now I'm thinking maybe an infinite loop in my code? This makes sense, as it would be in the main thread (it wouldn't be a recursive loop, as that would break out with an exception). Note to self: change ramses_cc so that
There should be.
And Boom! This makes sense to me. The obvious thing for you to try is restarting HA with zigbee disabled. That information would be useful to me. I guess the solution is for ramses_cc to use pyserial-asyncio-fast and not pyserial-asyncio... It is not clear to me whether HA obliges me to use pyserial-asyncio-fast, or not. I have looked around, but have not found any information on this. |
So we have this in zigpy:

    try:
        import serial_asyncio_fast as pyserial_asyncio
        LOGGER.info("Using pyserial-asyncio-fast in place of pyserial-asyncio")
    except ImportError:
        import serial_asyncio as pyserial_asyncio

I wonder what would happen if I do the same? |
Good enough for me then! :-)
I had actually done that already, given the zigpy being there, and it didn't make a difference. In fact, given I'm using zigbee2mqtt, wouldn't it make sense that no actual zigbee is being used by HA Core? I did also disable all addons and possible integrations that would make any use of the zigbee coordinator (SkyConnect, also with Thread support), but that zigpy line still appears.
Is this something I can test? |
It all depends on how technical you are.
Or maybe wait until I have some spare time to do some testing... |
If it's editing files in the |
In 0.x.20 (i.e. 0.31.20 and 0.41.20), which uses ramses_rf version 0.31.20, ramses_rf now includes:

    try:
        import serial_asyncio_fast as pyserial_asyncio
        LOGGER.info("Using pyserial-asyncio-fast in place of pyserial-asyncio")
    except ImportError:
        import serial_asyncio as pyserial_asyncio

Hopefully released today - YMMV. |
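For anyone wanting to check which of the two modules a given Python environment would pick up with the fallback above, the following is a minimal sketch using only the standard library. The helper name `pick_serial_backend` is hypothetical and not part of ramses_rf or zigpy; it only reports which module name is importable, in the same preference order as the `try`/`except` shown.

```python
import importlib.util


def pick_serial_backend(candidates=("serial_asyncio_fast", "serial_asyncio")):
    """Return the first importable module name from candidates, or None.

    The default order mirrors the fallback above: prefer the '-fast'
    package, then fall back to the original one.
    """
    for name in candidates:
        # find_spec() returns None for a top-level module that is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print(pick_serial_backend())
```

Run inside the same environment as HA Core, this would show whether pyserial-asyncio-fast is present there; it does not say which module any particular integration has actually imported.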
Thank you for the update. I've run this with 0.31.20, first in read-only mode - it runs, though it still doesn't show an integration. However, without the read-only flag, HA Core behaves the same way: eventually it just stops responding to any requests and only a hard shutdown is able to solve it. Current logs (filtered by "ramses" occurrences):
Here is an excerpt of the middle of the log, when ramses loads:
And the final lines of the full log:
|
Also, don't know if using 0.41.20 would help, but to try that one is there any way of setting read_only mode in the config_flow? |
Yes, you can do this.
Enter: disable_sending: true ... and click on SUBMIT. I believe you then have to:
|
I am not sure about this:
Can you try installing MQTT (even if you don't use it)? Or you can edit the manifest.json so that it removes MQTT (only edit the

    {
      "domain": "ramses_cc",
      "name": "RAMSES RF",
      "codeowners": ["@zxdavb"],
      "dependencies": [],
      "documentation": "https://github.com/zxdavb/ramses_cc",
      "issue_tracker": "https://github.com/zxdavb/ramses_cc/issues",
      ... |
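The manifest edit suggested above can also be sketched in code. This is only an illustration: the helper `drop_dependency` is hypothetical, and it assumes the dependency is listed under the name "mqtt" in the manifest's "dependencies" array, which the thread does not show.

```python
import json


def drop_dependency(manifest_text: str, name: str = "mqtt") -> str:
    """Return manifest JSON text with `name` removed from "dependencies".

    Assumption: the dependency appears as a plain string in the
    "dependencies" list. All other keys are left untouched.
    """
    manifest = json.loads(manifest_text)
    manifest["dependencies"] = [
        dep for dep in manifest.get("dependencies", []) if dep != name
    ]
    return json.dumps(manifest, indent=2)


if __name__ == "__main__":
    example = '{"domain": "ramses_cc", "dependencies": ["mqtt"]}'
    print(drop_dependency(example))
```

Editing the file by hand achieves the same thing; either way, keep a copy of the original manifest so the change can be reverted.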
I am not sure about this:
... never really seen it before. |
I remembered that I have to edit the configuration.yaml to get HA Core access back, so it's better to keep using 0.3x.xx while this isn't working properly. I don't know where to edit the configuration which will be set by the flow, but even if it's a file as well, it's likely to be a lot more difficult to find ramses there than in configuration.yaml.
I have MQTT installed and running (many entities, including from Zigbee2MQTT), so I guess that means it's just waiting for MQTT to load (which it does).
That I can't comment on either. Thanks for your help, once again. |
Is there any configuration flag that allows clearing ramses_cc's cache? |
Thank you for the pointers, I had been through the docs, but forgot about that. Right now, I'm using:
and
In any case, I've run this a bunch of times and even though sometimes it doesn't log zigpy, most of the time the log ends with the
I can add the following, which makes me think that this is somehow triggering some host hardware issue: as I mentioned previously I can access HA via the Proxmox console just fine, and running top returns high CPU occupancy but no process with actual high occupancy. On the other hand, running top confirms that the HA VM is using up the CPU. Nonetheless, Proxmox still runs very smoothly. The only actual issue is that HA Core stops replying to any requests. |
Updated from proxmox 7.4 to 8.2.2, and still the same. But this could be related to proxmox. |
If you mean the USB passthrough, yes, I have tried both options. At the moment, with the following read only config I get some very bare HA logs and no packet logs:
Setting 31.20 HA log, with `disable_sending` set to `false`
Home Assistant logs below end (almost every time) with the
I'm not using versions 41.x because I don't know of an easy way of setting
I can also add that with both runs above, no integration entities are created, whereas at some point in the initial tries they were (although with issues).
I've just now tried out 31.16 and although it doesn't prevent HA Core from running fine, ramses_cc fails to load:
31.16 log, with `disable_sending` set to `false`
Also tried out 31.19 and the behavior is pretty much the same as 31.20. |
I am sorry - I have run out of ideas. IMO, something is happening at the (virtualised) hardware layer. The next step would be to take ProxMox completely out of the equation. |
Thank you for your effort in any case. I'll try to see if I can find something in the Proxmox communities. Would it be feasible to run ramses_cc somehow on an isolated Raspberry Pi Zero? Maybe publishing to MQTT? |
Someone would have to write the code - it wouldn't be difficult. The other option would be to re-instate RFC2217. I don't have the time, sorry. However, what you're wanting: an MQTT version of the evofw3 dongle, is just around the corner...
|
I should still check your dongle outside of your VM environment - if it's a HW problem, then you can have it replaced. But if you're going to buy a new one, I'd wait for the MQTT version from Indalo Tech. |
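One way to check the dongle outside of HA is simply to read whatever it emits on the serial port. The sketch below is an assumption-laden illustration, not a ramses_rf tool: the helper `read_lines` is hypothetical and works on any object with a `readline()` method, and the port name and baud rate in the usage comment are placeholders to be replaced with the values for the actual device.

```python
import io


def read_lines(stream, max_lines=20):
    """Read up to max_lines from a file-like object and return the
    non-empty lines as text. Stops early when readline() returns b""
    (end of data, or a read timeout on a serial port).
    """
    lines = []
    for _ in range(max_lines):
        raw = stream.readline()
        if not raw:
            break
        text = raw.decode("ascii", errors="replace").strip()
        if text:
            lines.append(text)
    return lines


# Hypothetical usage with pyserial (port name and baud rate are assumptions):
#   import serial
#   with serial.Serial("/dev/ttyUSB0", 115200, timeout=5) as port:
#       print(read_lines(port))

if __name__ == "__main__":
    # Stand-in data so the sketch runs without any hardware attached.
    print(read_lines(io.BytesIO(b"first line\r\n\r\nsecond line\r\n")))
```

An empty result on a machine where the port opens without error would be consistent with a connection or hardware problem, though it does not prove one.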
Thank you for all the help so far. I'll try and check the USB dongle as you suggested first. The Indalo Tech site store has been down all day, so I haven't been able to check your suggestion. |
@zxdavb Do you know if the online shop is coming back? |
Just tested on a Raspberry Pi, with almost nothing other than a base HASS install, and it's the exact same behavior. I guess this is a hardware fault with the dongle. |
https://freewebstore.status.io/
In that case, I would contact Pete (who sold you the dongle), and ask for next steps. Maybe you could raise an issue on this github repo: |
I've reached out to Peter and there's an issue with the online store. |
@zxdavb Just to let you know that this all turned out to originate in the USB plug not being properly seated in the connectors, even though I tried several of them. I got a few verification requests from Peter and ended up reaching that conclusion: the dongle needs to be plugged in somewhat forcibly, in a position where it connects properly. Apologies for such an underwhelming conclusion, but it is what it is. On to configuring now. |
[EDIT] This issue appears to have been caused by faulty hardware.
Describe the bug
Integration spikes the CPU just after it tries to start and the only way out is by hard resetting HASSOS. HA Core gets completely inaccessible.
I have been trying to set up ramses-cc with an indalo-tech USB stick for a heat recovery ventilation unit, and tried both 41.19 and 31.19, as well as the x.16 versions; the process has been very inconsistent: at some point the integration was actually able to get some readings, but after I tried to use those readings to set up the known_list and schemas, it just started to crash HA. The last time, I had to manually remove the integration files from the config/custom_components folder just to get my instance booting.
To Reproduce
Nothing specific, using flow with no settings (just the serial port) or with the YAML below.
Expected behavior
It should at least not have this CPU leakage.
Please complete the following information:
`ramses_cc:` section from configuration.yaml
There's nothing relevant in /mnt/data/supervisor/homeassistant/home-assistant.log:
Below is an excerpt from the one single time out of tens of tries where the integration was actually reading communication values (in this case I had only the controller ID (18:003599) in the known_list and nothing else).
log
Additional context
HASSOS running on proxmox 7.4:
CPU(s) 4 x Intel(R) Celeron(R) N5105 @ 2.00GHz (1 Socket)
Kernel Version Linux 6.2.9-1-pve #1 SMP PREEMPT_DYNAMIC PVE 6.2.9-1 (2023-03-31T10:48Z)
PVE Manager Version pve-manager/7.4-17/513c62be