Dealing with a bug that persists a restart

JosuGZ
Posts: 48
Joined: Tue Jan 14, 2020 9:47 am

Postby JosuGZ » Tue May 24, 2022 11:23 am

I've witnessed this behavior 4 times in several years so it is very hard to debug:

My chip is trying to read from a UART and doesn't get a response. After a while, it assumes that something is wrong and crashes, which restarts it. Then the same thing happens again. And again.

Either the chip is not writing (hence not getting a response) or it is unable to read. And the big problem is that this survives my watchdog task restarting the chip.

Then, I trigger a restart with the enable pin, and the bug goes away.

Any suggestions? Perhaps this is a known bug fixed in some IDF version? It is as if some bit flipped and is not properly reinitialized after a restart.

The biggest problem is that if this happens, a manual restart is required.

boarchuz
Posts: 606
Joined: Tue Aug 21, 2018 5:28 am

Re: Dealing with a bug that persists a restart

Postby boarchuz » Tue May 24, 2022 12:30 pm

A software restart (including panic, abort, etc) doesn't reset the whole system.

Only power on, RTC WDT, and brownout will do that. You can do a 'full' reset in software by using the RTC WDT.

This is a good start: https://github.com/espressif/esp-idf/bl ... #L749-L754

(Although I should add that this might still be a bug. The UART should not break because of a restart. The above is a workaround to guarantee a clean reset.)
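
A minimal sketch of the RTC WDT approach, assuming the `soc/rtc_wdt.h` helpers that shipped with ESP-IDF releases of that era (their availability may differ in other versions). The function name `full_reset_via_rtc_wdt` is made up for illustration, and the `#else` branch contains host-only stand-ins that are not part of ESP-IDF; they exist only so the call sequence can be compiled and checked off-target.

```c
#include <stdbool.h>

#ifdef ESP_PLATFORM
#include "soc/rtc_wdt.h"   /* real RTC WDT helpers on the target */
#else
/* Host stand-ins (NOT part of ESP-IDF): they only record what was requested. */
typedef enum { RTC_WDT_STAGE0 = 0 } rtc_wdt_stage_t;
typedef enum { RTC_WDT_STAGE_ACTION_RESET_RTC = 4 } rtc_wdt_stage_action_t;
static struct {
    bool enabled;
    bool locked;
    int action;
    unsigned int timeout_ms;
} wdt_stub;
static void rtc_wdt_protect_off(void) { wdt_stub.locked = false; }
static void rtc_wdt_protect_on(void)  { wdt_stub.locked = true; }
static void rtc_wdt_disable(void)     { wdt_stub.enabled = false; }
static void rtc_wdt_enable(void)      { wdt_stub.enabled = true; }
static int rtc_wdt_set_stage(rtc_wdt_stage_t stage, rtc_wdt_stage_action_t action)
{
    (void)stage;
    wdt_stub.action = (int)action;
    return 0;
}
static int rtc_wdt_set_time(rtc_wdt_stage_t stage, unsigned int timeout_ms)
{
    (void)stage;
    wdt_stub.timeout_ms = timeout_ms;
    return 0;
}
#endif

/* Arm stage 0 of the RTC watchdog so that it resets the main system and the
 * RTC domain once timeout_ms has elapsed. */
void full_reset_via_rtc_wdt(unsigned int timeout_ms)
{
    rtc_wdt_protect_off();   /* unlock the watchdog registers */
    rtc_wdt_disable();       /* stop it while it is reconfigured */
    rtc_wdt_set_stage(RTC_WDT_STAGE0, RTC_WDT_STAGE_ACTION_RESET_RTC);
    rtc_wdt_set_time(RTC_WDT_STAGE0, timeout_ms);
    rtc_wdt_enable();
    rtc_wdt_protect_on();    /* lock the registers again */
#ifdef ESP_PLATFORM
    while (true) { }         /* on the target: wait for the watchdog to fire */
#endif
}
```

Calling this from the application's own failure handler, instead of `abort()` or `esp_restart()`, would be one way to get the 'full' reset described above.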

JosuGZ
Posts: 48
Joined: Tue Jan 14, 2020 9:47 am

Re: Dealing with a bug that persists a restart

Postby JosuGZ » Wed May 25, 2022 9:14 am

I will investigate that so I can do a full restart (can I trigger the watchdog manually?). Would a full restart like the one you mention clear the RTC_NO_INIT memory?
boarchuz wrote:
Although I should add that this might still be a bug. The UART should not break because of a restart
Just to clarify: the restart does not break the UART; rather, the UART does not "heal" after the restart (the original cause of the UART breaking is unknown). I'm on an old version so perhaps this is fixed, who knows.

boarchuz
Posts: 606
Joined: Tue Aug 21, 2018 5:28 am

Re: Dealing with a bug that persists a restart

Postby boarchuz » Wed May 25, 2022 10:33 am

JosuGZ wrote:
Wed May 25, 2022 9:14 am
can I trigger the watchdog manually?
The linked IDF snippet will do this - set the timeout to 0 to reset immediately.
JosuGZ wrote:
Wed May 25, 2022 9:14 am
Would a full restart like the one you mention clear the RTC_NO_INIT memory?
It shouldn't ever be cleared.
I'm guessing memory isn't powered down during an RTC WDT reset, so I would expect whatever is there to remain intact. (Maybe someone who knows a lot more about it can answer that for sure?)
Of course you'd use a CRC or similar to check contents on reset anyway.

JosuGZ
Posts: 48
Joined: Tue Jan 14, 2020 9:47 am

Re: Dealing with a bug that persists a restart

Postby JosuGZ » Wed May 25, 2022 10:23 pm

boarchuz wrote:
Of course you'd use a CRC or similar to check contents on reset anyway.
I use a special 64-bit key; if it is there, I know I'm dealing with data from the previous run.
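
For illustration, the key and the CRC can be combined in one check. This is only a sketch: the struct layout, the `RETAINED_MAGIC` value, the `boot_count` payload and all function names are invented here, not taken from anyone's code in this thread. On the chip the struct instance would live in no-init RTC memory; the logic itself is plain C and builds anywhere.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of the data kept across resets. */
typedef struct {
    uint64_t magic;       /* marker key identifying data from a previous run */
    uint32_t boot_count;  /* example payload (assumption) */
    uint32_t crc;         /* CRC-32 over all bytes that precede this field */
} retained_t;

#define RETAINED_MAGIC 0x52455441494E4544ULL  /* arbitrary illustrative value */

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by zlib). */
uint32_t crc32_calc(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    while (len--) {
        crc ^= *p++;
        for (int i = 0; i < 8; i++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* True only if the key matches and the payload has not been corrupted. */
bool retained_is_valid(const retained_t *r)
{
    return r->magic == RETAINED_MAGIC &&
           r->crc == crc32_calc(r, offsetof(retained_t, crc));
}

/* Call after every change to the payload so the CRC stays current. */
void retained_commit(retained_t *r)
{
    r->magic = RETAINED_MAGIC;
    r->crc = crc32_calc(r, offsetof(retained_t, crc));
}

/* Call early at boot. Returns true if data from the previous run survived;
 * otherwise the struct is reset to a known-good empty state. */
bool retained_init(retained_t *r)
{
    if (retained_is_valid(r)) {
        return true;
    }
    memset(r, 0, sizeof *r);
    retained_commit(r);
    return false;
}
```

The key alone tells you the memory was written by a previous run; the CRC additionally catches the case where the key survived but part of the payload did not.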

JosuGZ
Posts: 48
Joined: Tue Jan 14, 2020 9:47 am

Re: Dealing with a bug that persists a restart

Postby JosuGZ » Thu May 26, 2022 2:06 pm

My particular problem was a pin staying high after a reset. Something as simple as this can reproduce it:

main:
delay 10s
set pin_32 high
abort
The pin starts low, then goes high and never returns to low, whether the reset comes from abort, from esp_restart, or from entering a critical section so that the interrupt watchdog fires. Only the enable pin brings pin_32 back to its initial configuration.

So I believe the solution with the watchdog won't work.

This is probably fixed in some newer version, I hope.

boarchuz
Posts: 606
Joined: Tue Aug 21, 2018 5:28 am

Re: Dealing with a bug that persists a restart

Postby boarchuz » Thu May 26, 2022 5:41 pm

If pin 32 is configured as an RTC GPIO then it should be unaffected by a software restart, so that's the expected behaviour. If it's configured as a digital GPIO then it should be reset back to default.

RTC WDT will reset everything. It's functionally equivalent to toggling the enable pin.
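
As a quick reference for which pins this can apply to, here is a small sketch that lists the GPIOs of the original ESP32 that have an RTC function. It assumes the chip in question is the original ESP32 (other variants have a different set), and the function name is invented for this example; on the target, `rtc_gpio_is_valid_gpio()` from `driver/rtc_io.h` answers the same question.

```c
#include <stdbool.h>
#include <stddef.h>

/* GPIOs of the original ESP32 that can be routed to the RTC IO domain. */
static const int esp32_rtc_capable_gpios[] = {
    0, 2, 4, 12, 13, 14, 15, 25, 26, 27, 32, 33, 34, 35, 36, 37, 38, 39
};

/* Returns true if the given GPIO number has an RTC function on the ESP32. */
bool esp32_gpio_is_rtc_capable(int gpio)
{
    size_t n = sizeof esp32_rtc_capable_gpios / sizeof esp32_rtc_capable_gpios[0];
    for (size_t i = 0; i < n; i++) {
        if (esp32_rtc_capable_gpios[i] == gpio) {
            return true;
        }
    }
    return false;
}
```

GPIO 32 is in this list, so whether it keeps its state across a software restart depends on whether it is currently routed through the RTC domain or used as a plain digital GPIO.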
