sliding_window_moving_average causes boot looping. #394
Comments
You're right about the first part, indeed only specifying But with the configuration you gave here, I don't see any boot looping. The log you provided only contains the last part of the log, if it were to crash there would be a long stacktrace printed before that. I would need that stacktrace to debug this issue. |
I'll give it another go tomorrow and post the full log. Getting late here now. |
I tried updating to the following (clean build) OTA:
I can ping the device but I receive no sensor updates from it via the log or in HA. The device is shown as offline in ESPhome and the status is 'unavailable' in HA. I waited until the log terminated and a further 5 minutes for HA to show any sign of this sensor. I removed the sliding window filters and tried re-uploading a clean build OTA. The ESP then correctly started logging to ESPhome and sending sensor readings again. There really does seem to be something amiss with the sliding window average filter. |
Update, to simplify things I tried removing the multiplication filter and the sliding average from the wifi sensor leaving just this sensor:
Same problem. The ESP never publishes any mqtt messages. It can be seen as online by ESPhome and can be pinged and updated OTA. Removing the sliding window filter and re-flashing resumes normal operation. |
I tried this using the exponential average instead.
With this result:
Show Logs:
|
I just tried a new Wemos D1 Mini board. Same result. Boot loops with |
PROGRESS! I removed the wifi and status sensors. I could then use the filtered A0 sensor. Adding either the wifi sensor or the status sensor back into the configuration results in failure to connect to wifi and a watchdog timeout / restart. Removing the filter from the A0 sensor allows both the status and wifi sensors to be included. Data and memory use when compiling do not seem to be the problem (about 50% and 30%). I really like to be able to monitor the state of my sensors easily, and this was working in the previous version (with built-in default filtering of sensors). Any chance you could investigate this, Otto? SUMMARY:
OR
Configs that upload but fail to run:
OR
|
@tomlut Ok so your logs provide some good info. Specifically, However, what is weird is that nothing is displayed on boot. If the code even reached the point where the moving average (or anything else, for that matter) is initialized, logging would already be set up. I also tried your configuration files and was not able to replicate the issue. |
Hi Otto, I have 4 new Wemos D1 minis on order. They should be a week or two away. I did try it with another D1 mini and got similar results, though thinking about it now, the failure to connect to wifi might have been a router / assigned static IP issue, as I was using this D1 for another sensor previously (so a MAC address mismatch in the router). Doing this test also absolutely trashed my HA mqtt integration (duplicate ids / no unique ids from device). Even stopping HA and removing the entries from the .storage files did not help. I ended up having to do a snapshot restore. Next time I test this I will disable discovery. Just FYI, I also tried from the command line on Windows with similar results. I’m reasonably sure it is not a faulty device, as this was working with the previous version of ESPhome. Though your inability to reproduce the issue is worrying. What hardware device are you using to replicate this? |
It was a wemos d1 mini as well. What's the most weird to me is the missing |
Ok if you can use A0 with the filter and other sensors specified in my config on the same hardware then it has to be a problem with my D1 board. I’ve tried reinstalling the addon and using the alternate command line approach on a whole different platform (Windows) so that rules out a problem with my ESPhome installation. I’ll dig out the other D1 and try again tomorrow - with discovery disabled and a correct static IP assignment. Thanks for your time. |
I deleted all reference to my other (spare-ish) D1 mini from my router and uploaded this config:
And it works. No issue at all.
Link: https://community.home-assistant.io/t/how-to-remove-an-integrated-esp-device/92764 Thanks for your help Otto. |
Oh damn. Buoyed by the above success I decided to update another ESP D1 mini. And immediately lost contact with it. The config:
The log:
|
FYI, you can't change the flash layout via OTA. It just re-uploads the new FW to a flash region just after the current sketch, and then overwrites it as it reboots. If you think that there might be some issue with your flash, you should hook it up to a UART and reflash it from there so that it can actually rewrite the flash map. I had to do this to get extra space on my ESP8266s with 1M of flash after disabling the SPIFFS reservation. That was fun, especially for the smart bulbs and plugs that I'd glued back together... |
I'm currently erasing all trace of this config from the esphome folder and will start again from creating a new device and uploading via serial. |
It failed. I am beginning to suspect it is not a problem with the D1 minis but a problem associating with wifi. I'm about to run some more tests. |
Ok this is very odd. For the lounge room dht sensor D1: I just deleted the static IP mapping from my router and removed the manual IP from the D1 mini config. No success after a re-upload via serial. It would still not connect to wifi with DHCP and looped. I removed mqtt discovery from the D1 mini and re-uploaded - it worked. I put back the manual IP and static mapping in my router - it worked. I put back mqtt discovery - it failed. Log for failed attempt:
I took the mqtt discovery config out of my original problem child D1 - the master_bed sensor (and left in all the filtering and sensors that were causing trouble originally) - it connected to wifi and it did not boot loop and worked as expected. I just noticed that the mqtt broker IP address in my configs is in single quotes (the component configuration example does not have these). Removing them did not change the situation. I do not understand how having HA mqtt discovery on prevents wifi connection / causes watchdog reset if using sliding windowed averaging. This is driving me insane. |
Just out of curiosity, what happens if you set send_first_at: 15 on the sliding_window_moving_average filter? |
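For readers following along, here is a minimal sketch of where `send_first_at` sits in a sensor definition. It is not the configuration from this thread: the sensor name and the window values are placeholders chosen for illustration, and the 4 s interval simply mirrors the sample rate mentioned elsewhere in the discussion. As far as I understand the filter documentation, `send_first_at` has to be less than or equal to `send_every`.

```yaml
sensor:
  - platform: adc
    pin: A0
    name: "Example A0 Sensor"    # hypothetical name, not from the thread
    update_interval: 4s          # assumed sample rate for illustration
    filters:
      - sliding_window_moving_average:
          window_size: 15        # average over the last 15 raw readings
          send_every: 15         # publish one averaged value per 15 readings
          send_first_at: 15      # hold back the first publish until 15 readings exist
```

With these values nothing is published for roughly the first minute after boot (15 readings at 4 s each), which keeps the first averaged publish away from the wifi connection phase.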
No change unfortunately. Serial upload of this config:
Results in:
Removing the mqtt discovery config results in a correctly operating board:
|
You see anything interesting if you set the log level to VERY_VERBOSE? |
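For reference, the log level is set on the `logger` component. This is only a sketch of the relevant fragment; any other logger options in the real config would stay as they are.

```yaml
logger:
  level: VERY_VERBOSE   # most detailed level; produces a lot of serial output
```

Note that this level prints far more over serial than the default, so it can noticeably change the timing of everything else running on the device.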
Hang on, this is even odder. If I don't define mqtt discovery, are the defaults used?
I'll give verbose a go. |
Yeah, you only need to set the broker address (plus creds if required). |
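A minimal `mqtt` block along those lines might look like the sketch below. The broker address and credentials are placeholders, not values from this thread; discovery options are left out entirely so the component falls back to its defaults.

```yaml
mqtt:
  broker: 192.168.1.10      # placeholder broker address
  username: mqtt_user       # placeholder, only needed if the broker requires auth
  password: mqtt_password   # placeholder
```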
So why when I actually define the defaults is it failing? Unless it's a case sensitive problem? true =/= True |
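On the case-sensitivity question: as far as I know, the YAML 1.1 rules used by most Python YAML loaders treat `true`, `True` and `TRUE` as the same boolean, so capitalisation alone should not matter. The sketch below shows what restating the documented discovery defaults explicitly could look like. It is an illustration only, not the exact lines used in this thread, and the broker address is a placeholder.

```yaml
mqtt:
  broker: 192.168.1.10              # placeholder broker address
  discovery: true                   # documented default; `True` parses to the same boolean
  discovery_retain: true            # documented default
  discovery_prefix: homeassistant   # documented default
```

In principle this block should behave the same as one that sets only `broker`.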
Let me try one more sensor update to be sure before closing. |
Nope. It was enabling very_verbose mode that actually fixed the problem. Ugh! |
This config, that failed (shown in post above) when using the default logger level:
Now successfully uploads and runs when the log level is set to very verbose:
I have no idea what is going on. I don't think it has anything to do with discovery or filtering. At a guess the framework for wifi on this new build could have some sort of race condition problem. But really I have no idea. |
Honestly, I'm kinda surprised that it's trying to read the DHT at the same time wifi setup is going on. Seems to me like that component shouldn't start sampling until later. Sure seems like you're running into some weird race condition with the wifi scan + sensor poll going on at the same time. Maybe @OttoWinter will have some new idea with the additional information you've collected. Just out of curiosity, does putting the DHT on a different pin change anything? I know on the ESP32s at least, some pins are shared with internal components. Not sure if that's a possibility for ESP8266s or not. |
If I'm being honest, at the moment it's too much hassle to change the DHT input, and the fact that there was a problem with the A0 input on the other device leads me to believe this is not the issue. Also, the fact that changing seemingly unrelated components (even just defining default options) or even altering the log level changes the fault says to me that there is a much deeper issue, and changing the input could send me down another false path. |
Oh right, this is ADC. There are a BUNCH of issues with reading the ADC while wifi is active.
One of those issues links to this page which says:
I am guessing that the ADC is used even more extensively during initial wifi setup and scanning. This suggests that setting |
Is an ADC call every 4 seconds too often though? The WDT seems to trip after about 8 seconds in the logs above. Or more importantly, why does leaving the default mqtt discovery parameters alone alleviate the problem (with the same 4 sec sample frequency)? Or setting the logging level to very_verbose? If anything this should make things worse. I've kind of found a workaround - don't define default mqtt discovery parameters - that seems to be working across every device I try it on. I've tried other boards that have I2C sensors rather than A0 or DHT22 sensors. All work as long as I just let the mqtt discovery defaults define themselves. |
I agree this is a very deep issue. Most likely, though, it is in the closed-source ESP SDK or even the silicon itself. But even if there are the ADC issues you linked to: a) they would only explain why it cannot connect; I don't think this could directly trigger the WDT. And then there is the fact that I cannot replicate this. I tried all of the configs above with a bunch of my own devices (nodemcu, wemos d1, sonoff, direct ESP8266 chip, etc.) and could not replicate the behavior. Something truly weird is going on here :/ And I don't know how to find out what it really is. |
That's because that particular device does not have an ADC sensor (unless the ADC is being used internally to measure the RSSI?).
Did you include these config lines:
It's the common factor across all my D1 minis. Including these (superfluous) config parameters causes the boot looping for me (unless the logging level is set to very verbose, in which case the config is not an issue). It does not matter what sensors I'm using (ADC, I2C, DHT); these lines seem to be the common factor in the runtime failure. Either way, thank you for taking the time to try to replicate the issue. I may try setting discovery to false when I get my next batch of D1 minis, just to see what happens. I learned today the perils of using already discovered devices for testing. It was quite a chore to delete the duplicates from my HA integrations. I'll close this for now as there is nothing concrete we can do, there is no rhyme or reason as to why it is happening, and I have a workaround (don't include unneeded default discovery parameters). It might be wise to keep it in mind for any future issues that are reported though. |
Also just FYI. I purchased all the Wemos D1 minis from the official Aliexpress store. These are not knock-off imitations. |
Operating environment/Installation (Hass.io/Docker/pip/etc.):
Hassio 0.85.1, ESPhome 1.10.0
ESP (ESP32/ESP8266/Board/Sonoff):
ESP8266 (Wemos D1 mini)
Affected component:
https://esphomelib.com/esphomeyaml/components/sensor/index.html#sensor-filters
Description of problem:
Including the recommended configuration to bring back a windowed moving average causes validation errors:
Including the full sliding window filter configuration causes boot looping:
Removing the sliding window average configuration compiles and runs as expected. Though the sensor is not averaged (as is wanted):
Problem-relevant YAML-configuration entries:
As per description above.
Traceback (if applicable):
Additional information: