Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trashage of the heap #1516

Closed
davepl opened this issue Jul 4, 2021 · 8 comments · Fixed by #1517
Closed

Trashage of the heap #1516

davepl opened this issue Jul 4, 2021 · 8 comments · Fixed by #1517
Assignees
Labels

Comments

@davepl
Copy link

davepl commented Jul 4, 2021

In Irrecv.cpp, you poke values into rawbuf[rawlen] in many cases. However, you only allocate 100 bytes. When rawlen gets to be 100, you still do it. So you actually need to allocate one more byte than you do (or stop writing past the end of the buffer, but its easier to allocate an extra byte).

I've confirmed this change locally stops it from overwriting the heap tag.

Thanks!

To fix this, in IRRecv::IRecv, change

params.rawbuf = new uint16_t[bufsize];

to

params.rawbuf = new uint16_t[bufsize+1];

@crankyoldgit crankyoldgit self-assigned this Jul 5, 2021
@crankyoldgit
Copy link
Owner

Thanks for the report, I'll look it to it soon.

@crankyoldgit
Copy link
Owner

crankyoldgit commented Jul 5, 2021

Hmm. I'm not sure you're correct, but I'll look at writing some unit tests to confirm/verify that if you disagree or have proof.

The code that adds to the array/memory and stops the capture process is:

uint16_t rawlen = params.rawlen;
if (rawlen >= params.bufsize) {
params.overflow = true;
params.rcvstate = kStopState;
}
if (params.rcvstate == kStopState) return;
if (params.rcvstate == kIdleState) {
params.rcvstate = kMarkState;
params.rawbuf[rawlen] = 1;
} else {
if (now < start)
params.rawbuf[rawlen] = (UINT32_MAX - start + now) / kRawTick;
else
params.rawbuf[rawlen] = (now - start) / kRawTick;
}
params.rawlen++;

Let's assume params.bufsize is 100 per your example, and params.rawlen is 98. i.e. we are nearing the end of the buffer, but not there yet, and we are still capturing data (i.e. we haven't timed out), thus params.rcvstate is kMarkState.
We both agree that params.rawbuf[99] is the last entry (a uint16_t) of the array, beyond that we are stomping over memory.

Okay, stepping through the code:
We're at 98 ... and we get a interrupt for new data, thus execute gpio_intr() (which contains that above code snippet).

  1. 98 is less than bufsize (100) so we skip the if (rawlen >= params.bufsize) {} clause.
  2. The rcvstate is kMarkState so we skip the (params.rcvstate == kStopState) {} clause.
  3. Next, if (params.rcvstate == kIdleState) fails, so we take the else path. This path puts a value in params.rawbuf[rawlen] i.e. params.rawbuf[98] which is fine/safe.
  4. Next, we rawlen++, so rawlen is now 99.
  5. We basically exit the routine as we've now handled that interrupt.

Now we're at 99 ... and we get another interrupt for new data, thus execute gpio_intr() again.

  1. 99 is less than bufsize (100) so we skip the if (rawlen >= params.bufsize) {} clause.
  2. The rcvstate is kMarkState so we skip the (params.rcvstate == kStopState) {} clause.
  3. Next, if (params.rcvstate == kIdleState) fails, so we take the else path. This path puts a value in params.rawbuf[rawlen] i.e. params.rawbuf[99] which is fine/safe, and is the last safe entry in the array etc.
  4. Next, we rawlen++, so rawlen is now 100.
  5. We basically exit the routine as we've now handled that interrupt.

Now we're at 100, the buffer is technically already full ... and we get another interrupt for new data, thus execute gpio_intr() again.

  1. 100 is equal to bufsize (100) so we execute the if (rawlen >= params.bufsize) {} clause this time. i.e. Set an overflow flag, and rcvstate is set to kStopState;
  2. The rcvstate is now kStopState so we execute the (params.rcvstate == kStopState) {} clause. i.e. return.
    There is no further writing to rawbuf or incrementing rawlen. It's locked at 100 (i.e. bufsize)

So, the end result of filling the buffer is rawlen gets the value of bufsize, but the last write to rawbuf is rawbuf[rawlen - 1] when using the final value of rawlen. i.e. In this case rawbuf[100 - 1] or rawbuf[99], which is safe for a 100 entry array.

Am I missing something here?

@crankyoldgit crankyoldgit removed the bug label Jul 5, 2021
@davepl
Copy link
Author

davepl commented Jul 5, 2021 via email

@crankyoldgit
Copy link
Owner

crankyoldgit commented Jul 5, 2021

Yep. I see it now. I forgot that I had moved some of the code out of the interrupt routines into other places, to reduce IRAM usaged. i.e. into IRrecv::decode(). Thus I didn't even think to check decode() because nothing other than the interrupt routines, IRrecv::copyIrParams(), & IRrecv::crudeNoiseFilter() should write to a rawbuf. Fix incoming. Thanks for the second set of eyes.

@davepl
Copy link
Author

davepl commented Jul 5, 2021 via email

crankyoldgit added a commit that referenced this issue Jul 5, 2021
* Fix an issue where we write past the end of the capture buffer when it is full. Two options to fix this:
  1. Extend all capture buffers by 1 entry. i.e. upto 4 bytes of extra unused heap and some FLASH/PROGMEM bytes. _or_
  2. Skip the memory write when we have overflowed. i.e. Possibly slightly more than 4 bytes of FLASH/PROGMEM used.
  - CPU overhead should be about the same.
  - Given heap & memory is a more critical resource than Flash/PROGMEM, opting for Option 2.

TODO: Add unit tests to confirm this works and never happens again.

Kudos to @davepl for reporting the issue and diagnosing the offending line of code.

Fixes #1516
@crankyoldgit
Copy link
Owner

@davepl Please checkout/try branch https://github.com/crankyoldgit/IRremoteESP8266/tree/Issue1516 and let me know if it works for you?
See 2988443 for info on the coding decision. Your suggested solution is probably safer in the balance of things. i.e. in case there is another similar situation. However the pressure on usage of memory vs Flash leans me to the path I've taken. However, I'm really on the fence. If it wasn't on an imbedded system, I'd do both. ;-)

Now to write some tests so this never happens again.... sigh

crankyoldgit added a commit that referenced this issue Jul 5, 2021
So we can be sure it is fixed, and it doesn't happen again.
* Add a helper method `IRrecv::_getParamsPtr` to access `params` in Unit tests.

Fixes #1516
@crankyoldgit
Copy link
Owner

And now with unit tests confirming that this particular issue in decode() has been squashed. Phew.

crankyoldgit added a commit that referenced this issue Jul 6, 2021
…1517)

* Fix an issue where we write past the end of the capture buffer when it is full. Two options to fix this:
  1. Extend all capture buffers by 1 entry. i.e. upto 4 bytes of extra unused heap and some FLASH/PROGMEM bytes. _or_
  2. Skip the memory write when we have overflowed. i.e. Possibly slightly more than 4 bytes of FLASH/PROGMEM used.
  - CPU overhead should be about the same.
  - Given heap & memory is a more critical resource than Flash/PROGMEM, opting for Option 2.
* Add a helper method `IRrecv::_getParamsPtr` to access `params` in Unit tests.
* Unit tests so we can be sure it is fixed, and it doesn't happen again.

Kudos to @davepl for reporting the issue and diagnosing the offending line of code.

Fixes #1516
crankyoldgit added a commit that referenced this issue Jul 6, 2021
_v2.7.19 (20210706)_

**[Bug Fixes]**
- Illegal Heap write in rawbuf when the capture has overflowed. (#1516 #1517)
- PANASONIC_AC: Fix Low and High fan speeds (#1515)
- Fix MDNS in IRServer and IRMQTTServer example code (#1498 #1499)
- IRac: Fix off-by-one error in Coolix's sleep setting. (#1500)
- Fix undefined constant (#1490)

**[Features]**
- Add detailed support for Kelon ACs (#1494)
- Experimental basic support for Teknopoint A/C protocol (#1486 #1504)
- Daikin64: Add support for Heat mode (#1492)
- Basic support for `HAIER_AC176` 176 bit protocol. (#1480 #1481)

**[Misc]**
- GREE: Update inter-message gap timing (#1508 #1509)
- IRac: Change Coolix to send special messages after a normal message. (#1501 #1502)
- Fix compiler warnings causing Travis failures. (#1491)
- Update supported model info (#1477 #1485 #1488 #1489)
- Add HTML viewport meta tag to IRServer and IRMQTTServer examples (#1467 #1469)
crankyoldgit added a commit that referenced this issue Jul 6, 2021
* Regenerate Doxygen documentation

* v2.7.19 release
_v2.7.19 (20210706)_

**[Bug Fixes]**
- Illegal Heap write in rawbuf when the capture has overflowed. (#1516 #1517)
- PANASONIC_AC: Fix Low and High fan speeds (#1515)
- Fix MDNS in IRServer and IRMQTTServer example code (#1498 #1499)
- IRac: Fix off-by-one error in Coolix's sleep setting. (#1500)
- Fix undefined constant (#1490)

**[Features]**
- Add detailed support for Kelon ACs (#1494)
- Experimental basic support for Teknopoint A/C protocol (#1486 #1504)
- Daikin64: Add support for Heat mode (#1492)
- Basic support for `HAIER_AC176` 176 bit protocol. (#1480 #1481)

**[Misc]**
- GREE: Update inter-message gap timing (#1508 #1509)
- IRac: Change Coolix to send special messages after a normal message. (#1501 #1502)
- Fix compiler warnings causing Travis failures. (#1491)
- Update supported model info (#1477 #1485 #1488 #1489)
- Add HTML viewport meta tag to IRServer and IRMQTTServer examples (#1467 #1469)
@crankyoldgit
Copy link
Owner

FYI, the committed & merged changes have been included in the newly released version of the library. i.e. v2.7.19

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants