I'm seeing seemingly random panic resets on a device, and they appear to be happening in crosscore.
Complete core dump is here: https://pastebin.com/vCGet7eL
The really weird thing is that this happened on devices at a customer's location: two of them did this at precisely the same time, and both coredumps look nearly identical.
My guess is that there was some power glitch and it caused one of the cores to lock up?
Anyone see anything that would point to a software issue? Maybe we just need better power filtering on our inputs.
Spurious panic in crosscore
-
- Posts: 9739
- Joined: Thu Nov 26, 2015 4:08 am
Re: Spurious panic in crosscore
Not sure you can conclude that the issue is in crosscore... all the other tasks are in there (mostly because they yielded, and that is the last position a task will be in before it gets descheduled), but all we know about the crashing tasks is that they called abort; the stack trace is broken from there on.
I agree that if two boards crash in the same way at exactly the same time, some power issue may be a culprit.
-
- Posts: 10
- Joined: Wed May 31, 2017 4:21 pm
Re: Spurious panic in crosscore
Thanks! That makes sense.
I guess related to this: all of my stack traces seem to get corrupted in this way (our panic handler uploads the stored-on-flash coredump to the server upon reboot). Even a test that simply triggers an assert (e.g., assert(false);) leaves the current thread's stack corrupted above the abort(). Is that common behavior in a panic, or is there some problem in my handler?
We're just uploading the full contents of the coredump partition.
I don't want to post the elf for the firmware publicly, but is there perhaps documentation on the coredump format or someone who could look at it to tell me if I'm corrupting it in some way?
-
- Posts: 9739
- Joined: Thu Nov 26, 2015 4:08 am
Re: Spurious panic in crosscore
I doubt it's you corrupting the coredump - if I recall correctly, it's protected by a hash or CRC or something. Can you tell me the ESP-IDF version you compile with? Could be that something is marked as 'noreturn' in that version, breaking core dumps on abort.
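If you want to rule out corruption during the upload itself, one option is to recompute a checksum over the bytes the server received. The following is only a rough sketch in plain C. It assumes a hypothetical layout in which the blob ends with a little-endian CRC-32 of everything before it, and it uses the standard reflected CRC-32 (polynomial 0xEDB88320). Neither of those is confirmed to match what ESP-IDF actually writes to the coredump partition, and the names crc32_ieee and blob_crc_ok are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit.
 * ASSUMPTION: this may not be the checksum variant ESP-IDF uses. */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, otherwise zero */
            uint32_t mask = -(crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}

/* ASSUMPTION (hypothetical layout): the last 4 bytes of the blob hold a
 * little-endian CRC-32 of all preceding bytes. Returns true on a match. */
bool blob_crc_ok(const uint8_t *blob, size_t len)
{
    if (blob == NULL || len < 4) {
        return false; /* too short to carry a trailing checksum */
    }
    size_t payload_len = len - 4;
    uint32_t stored = (uint32_t)blob[payload_len]
                    | ((uint32_t)blob[payload_len + 1] << 8)
                    | ((uint32_t)blob[payload_len + 2] << 16)
                    | ((uint32_t)blob[payload_len + 3] << 24);
    return crc32_ieee(blob, payload_len) == stored;
}
```

If a check like this passes on the server for the same bytes that were read from flash, the upload path is probably not what breaks the backtrace. The real offsets and checksum type would have to come from the ESP-IDF core dump documentation for your version.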
-
- Posts: 10
- Joined: Wed May 31, 2017 4:21 pm
Re: Spurious panic in crosscore
It appears to be the v4.0 release. It looks like I should go ahead and move to v4.0.1, but that does not seem to solve this particular issue.