Monitor hangs decoding core dump

felixcollins
Posts: 125
Joined: Fri May 24, 2019 2:02 am

Postby felixcollins » Thu Nov 02, 2023 2:37 am

I'm running IDF v4.4.6-98-g5f257494c5 with ADF.

When I trigger a coredump (using an assert(0) in code), I get the backtrace of the faulting task and can see that the coredump is being received, but then the monitor output stops and never resumes. The coredump is not decoded. How can I fix this or diagnose what is going wrong?

Alternatively, if I set up saving the coredump to flash, how can I retrieve it and decode it with 4.4.6 tools?

Thanks,
Felix

Below is an example of the monitor output.

Code:

Backtrace: 0x40081fb2:0x3ffbfc70 0x4008b805:0x3ffbfc90 0x40093a01:0x3ffbfcb0 0x400ddd72:0x3ffbfdd0 0x400dde5f:0x3ffbfdf0 0x40176373:0x3ffbfe20 0x40176440:0x3ffbfe70
0x40081fb2: panic_abort at C:/Users/felix/espclean/esp-idf/components/esp_system/panic.c:408

0x4008b805: esp_system_abort at C:/Users/felix/espclean/esp-idf/components/esp_system/esp_system.c:137

0x40093a01: __assert_func at C:/Users/felix/espclean/esp-idf/components/newlib/assert.c:85

0x400ddd72: do_volume_plus_long_click at C:/Users/felix/source/beakbox/Firmware/beakbox3/main/buttons.c:72

0x400dde5f: buttons_event_handler at C:/Users/felix/source/beakbox/Firmware/beakbox3/main/buttons.c:335
 (inlined by) buttons_event_handler at C:/Users/felix/source/beakbox/Firmware/beakbox3/main/buttons.c:313

0x40176373: handler_execute at C:/Users/felix/espclean/esp-idf/components/esp_event/esp_event.c:145
 (inlined by) esp_event_loop_run at C:/Users/felix/espclean/esp-idf/components/esp_event/esp_event.c:590

0x40176440: esp_event_loop_run_task at C:/Users/felix/espclean/esp-idf/components/esp_event/esp_event.c:115 (discriminator 15)





ELF file SHA256: 8a7e2d8d8397f99a

Initiating core dump!
I (8748) esp_core_dump_uart: Press Enter to print core dump to UART...
I (8755) esp_core_dump_uart: Print core dump to uart...
Core dump started (further output muted)
Received  28 kB...
Core dump finished!

felixcollins

Re: Monitor hangs decoding core dump

Postby felixcollins » Thu Nov 02, 2023 4:12 am

I read the doc on espcoredump.py, set up my code and partition table to store a coredump to flash and tried that, but it crashes.

Code:

C:\Users\felix\source\beakbox\Firmware\beakbox3>c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\Scripts\python.exe C:\Users\felix\espclean\esp-idf\components\espcoredump\espcoredump.py --port COM10 info_corefile C:\Users\felix\source\beakbox\Firmware\beakbox3\build\beakbox3.elf
espcoredump.py v0.4-dev
INFO: Invoke parttool to read image.
Traceback (most recent call last):
  File "C:\Users\felix\espclean\esp-idf\components\partition_table\parttool.py", line 365, in <module>
    main()
  File "C:\Users\felix\espclean\esp-idf\components\partition_table\parttool.py", line 358, in main
    op(**common_args)
  File "C:\Users\felix\espclean\esp-idf\components\partition_table\parttool.py", line 193, in _read_partition
    target.read_partition(partition_id, output)
  File "C:\Users\felix\espclean\esp-idf\components\partition_table\parttool.py", line 169, in read_partition
    partition = self.get_partition_info(partition_id)
  File "C:\Users\felix\espclean\esp-idf\components\partition_table\parttool.py", line 151, in get_partition_info
    partition = partition[0]
IndexError: list index out of range
ERROR: parttool script execution failed with err 1
INFO: b'esptool.py v3.3.4-dev\r\nSerial port COM10\r\nConnecting....\r\nDetecting chip type... Unsupported detection protocol, switching and trying again...\r\nConnecting....\r\nDetecting chip type... ESP32\r\nChip is ESP32-D0WD-V3 (revision v3.1)\r\nFeatures: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None\r\nCrystal is 40MHz\r\nMAC: 08:b6:1f:fd:bc:c0\r\nUploading stub...\r\nRunning stub...\r\nStub running...\r\n3072 (100 %)\r\n3072 (100 %)\r\nRead 3072 bytes at 0x8000 in 0.3 seconds (80.4 kbit/s)...\r\nHard resetting via RTS pin...\r\nRunning c:\\Users\\felix\\espclean\\tools\\python_env\\idf4.4_py3.8_env\\Scripts\\python.exe C:\\Users\\felix\\espclean\\esp-idf\\components\\esptool_py\\esptool\\esptool.py --port COM10 read_flash 32768 3072 C:\\Users\\felix\\AppData\\Local\\Temp\\tmp0rhl07ze...\r\n'
ERROR: Error during the subprocess execution
Traceback (most recent call last):
  File "C:\Users\felix\espclean\esp-idf\components\espcoredump\espcoredump.py", line 387, in <module>
    temp_core_files = info_corefile()
  File "C:\Users\felix\espclean\esp-idf\components\espcoredump\espcoredump.py", line 181, in info_corefile
    core_elf_path, target, temp_files = get_core_dump_elf(e_machine=exe_elf.e_machine)
  File "C:\Users\felix\espclean\esp-idf\components\espcoredump\espcoredump.py", line 66, in get_core_dump_elf
    loader = ESPCoreDumpFlashLoader(args.off, args.chip, port=args.port, baud=args.baud)
  File "C:\Users\felix\espclean\esp-idf\components\espcoredump\corefile\loader.py", line 449, in __init__
    self.target = self._load_core_src()
  File "C:\Users\felix\espclean\esp-idf\components\espcoredump\corefile\loader.py", line 164, in _load_core_src
    _header = EspCoreDumpV1Header.parse(coredump_bytes)  # first we use V1 format to get version
  File "c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\lib\site-packages\construct\core.py", line 288, in parse
    return self.parse_stream(io.BytesIO(data), **contextkw)
  File "c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\lib\site-packages\construct\core.py", line 300, in parse_stream
    return self._parsereport(stream, context, "(parsing)")
  File "c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\lib\site-packages\construct\core.py", line 312, in _parsereport
    obj = self._parse(stream, context, path)
  File "c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\lib\site-packages\construct\core.py", line 1982, in _parse
    subobj = sc._parsereport(stream, context, path)
  File "c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\lib\site-packages\construct\core.py", line 312, in _parsereport
    obj = self._parse(stream, context, path)
  File "c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\lib\site-packages\construct\core.py", line 2440, in _parse
    return self.subcon._parsereport(stream, context, path)
  File "c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\lib\site-packages\construct\core.py", line 312, in _parsereport
    obj = self._parse(stream, context, path)
  File "c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\lib\site-packages\construct\core.py", line 1020, in _parse
    data = stream_read(stream, self.length, path)
  File "c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\lib\site-packages\construct\core.py", line 92, in stream_read
    raise StreamError("stream read less than specified amount, expected %d, found %d" % (length, len(data)), path=path)
construct.core.StreamError: Error in path (parsing) -> tot_len
stream read less than specified amount, expected 4, found 0

ESP_pdragun
Posts: 12
Joined: Fri Dec 02, 2022 2:18 pm

Re: Monitor hangs decoding core dump

Postby ESP_pdragun » Thu Nov 02, 2023 2:53 pm

Hi,
the error from `espcoredump.py` usually means that the coredump partition was not found. This can be caused by, for example, a non-default partition table offset (anything other than 0x8000) or a misconfigured partition table. The bug affecting non-default partition table offsets is fixed in master but unfortunately has not yet been backported to the 4.4 branch.
As for the first issue, sending the coredump over UART and decoding it in the IDF monitor, I was not able to reproduce it. Could you please provide a minimal reproducible example? I modified our hello_world example with the `assert(0)` you mentioned, and the output was as expected for both the UART and flash coredump destinations. From your output, it seems the coredump was captured correctly and the monitor started decoding it. Even if decoding fails, though, the monitor should still print the undecoded coredump.
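
To check by hand whether a coredump partition is visible at the offset the tools read from, the partition table can be dumped and parsed. The following is only a minimal sketch and not the parttool.py implementation. It assumes the binary layout used by ESP-IDF partition tables (32-byte entries that start with the magic bytes 0xAA 0x50, with coredump being data subtype 0x03), and `find_coredump_partition` is a hypothetical helper name.

```python
import struct

# Assumed entry layout: magic, type, subtype, offset, size, label, flags
ENTRY_FORMAT = "<2sBBLL16sL"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 32 bytes per entry
ENTRY_MAGIC = b"\xaa\x50"
TYPE_DATA = 0x01
SUBTYPE_COREDUMP = 0x03


def find_coredump_partition(table: bytes):
    """Return (offset, size) of the first coredump partition, or None."""
    for pos in range(0, len(table) - ENTRY_SIZE + 1, ENTRY_SIZE):
        magic, ptype, subtype, offset, size, _label, _flags = struct.unpack_from(
            ENTRY_FORMAT, table, pos
        )
        if magic != ENTRY_MAGIC:
            # Not a partition entry (checksum entry or erased flash): stop here.
            break
        if ptype == TYPE_DATA and subtype == SUBTYPE_COREDUMP:
            return offset, size
    return None
```

The input would be the 3072 bytes that the log above shows being read with `esptool.py --port COM10 read_flash 32768 3072 <file>`. If the table actually lives at a different offset, those bytes contain no matching entry and the function returns None, which appears to correspond to the empty list behind the `IndexError` in parttool.py.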

Maybe you could provide some additional info, such as which chip you used, the project configuration, etc. I was using an esp32c3 with the example's default settings (apart from setting the coredump data destination to UART/flash).

Thanks,
Peter

felixcollins

Re: Monitor hangs decoding core dump

Postby felixcollins » Fri Nov 03, 2023 12:33 am

I'm less worried about the monitor hang if I can get the flash core dump to work.

Thanks for that info. That sort of information, if not the fix, should be back ported to the 4.4 branch as that is what is shipped with ADF.

I had moved the part table up. I shifted it back to 0x8000 and got the output below. So it seems there are more bugs...

The supported IDF for the master ADF is shown as 5.1. I have had no luck trying to update the IDF submodule, though, as it messes with the whole Python environment and component setup and breaks my build.

Is there any way to get a working stand-alone core dump decoding tool?
Thanks for your help, Felix

Code:

C:\Users\felix\source\beakbox\Firmware\beakbox3>c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\Scripts\python.exe C:\Users\felix\espclean\esp-idf\components\espcoredump\espcoredump.py --port COM10 info_corefile C:\Users\felix\source\beakbox\Firmware\beakbox3\build\beakbox3.elf
espcoredump.py v0.4-dev
INFO: Invoke parttool to read image.
INFO: esptool.py v3.3.4-dev
Serial port COM10
Connecting.......
Detecting chip type... Unsupported detection protocol, switching and trying again...
Connecting....
Detecting chip type... ESP32
Chip is ESP32-D0WD-V3 (revision v3.1)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: 08:b6:1f:fd:bc:c0
Uploading stub...
Running stub...
Stub running...
3072 (100 %)
3072 (100 %)
Read 3072 bytes at 0x8000 in 0.3 seconds (81.0 kbit/s)...
Hard resetting via RTS pin...
esptool.py v3.3.4-dev
Serial port COM10
Connecting....
Detecting chip type... Unsupported detection protocol, switching and trying again...
Connecting.....
Detecting chip type... ESP32
Chip is ESP32-D0WD-V3 (revision v3.1)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: 08:b6:1f:fd:bc:c0
Uploading stub...
Running stub...
Stub running...
65536 (100 %)
65536 (100 %)
Read 65536 bytes at 0xc000 in 5.9 seconds (88.3 kbit/s)...
Hard resetting via RTS pin...
Running c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\Scripts\python.exe C:\Users\felix\esp\esp-idf\components\esptool_py\esptool\esptool.py --port COM10 read_flash 32768 3072 C:\Users\felix\AppData\Local\Temp\tmpw24s3_7y...
Running c:\Users\felix\espclean\tools\python_env\idf4.4_py3.8_env\Scripts\python.exe C:\Users\felix\esp\esp-idf\components\esptool_py\esptool\esptool.py --port COM10 read_flash 49152 65536 C:\Users\felix\AppData\Local\Temp\tmpwme_9__9...
Read partition 'coredump' contents from device at offset 0xc000 to file 'C:\Users\felix\AppData\Local\Temp\tmpwme_9__9'

===============================================================
==================== ESP32 CORE DUMP START ====================

Crashed task handle: 0x3ffc0358, name: '', GDB name: 'process 1073480536'

================== CURRENT THREAD REGISTERS ===================
exccause       0x1d (StoreProhibitedCause)
excvaddr       0x0
epc1           0x40100860
epc2           0x0
epc3           0x0
epc4           0x4008f2bd
epc5           0x0
epc6           0x0
eps2           0x0
eps3           0x0
eps4           0x60723
eps5           0x0
eps6           0x0


==================== CURRENT THREAD STACK =====================


======================== THREADS INFO =========================

Traceback (most recent call last):
  File "C:\Users\felix\espclean\esp-idf\components\espcoredump\espcoredump.py", line 387, in <module>
    temp_core_files = info_corefile()
  File "C:\Users\felix\espclean\esp-idf\components\espcoredump\espcoredump.py", line 237, in info_corefile
    threads, _ = gdb.get_thread_info()
  File "C:\Users\felix\espclean\esp-idf\components\espcoredump\corefile\gdb.py", line 113, in get_thread_info
    result = self._gdbmi_run_cmd_get_one_response('-thread-info', 'done', 'result')['payload']
  File "C:\Users\felix\espclean\esp-idf\components\espcoredump\corefile\gdb.py", line 79, in _gdbmi_run_cmd_get_one_response
    return self._gdbmi_run_cmd_get_responses(cmd, resp_message, resp_type, multiple=False)[0]
  File "C:\Users\felix\espclean\esp-idf\components\espcoredump\corefile\gdb.py", line 72, in _gdbmi_run_cmd_get_responses
    raise ESPCoreDumpError("Couldn't find response with message '{}', type '{}' in responses '{}'".format(
corefile.ESPCoreDumpError: Couldn't find response with message 'done', type 'result' in responses '[]'

felixcollins

Re: Monitor hangs decoding core dump

Postby felixcollins » Fri Nov 03, 2023 12:59 am

I managed to get it to work with flashed coredump using the dbg_corefile option. Arguably a more useful tool anyway.

ESP_pdragun

Re: Monitor hangs decoding core dump

Postby ESP_pdragun » Fri Nov 03, 2023 9:29 am

felixcollins wrote: That sort of information, if not the fix, should be back ported to the 4.4 branch as that is what is shipped with ADF.
Thanks for your suggestion. I will see what can be done about backporting, and we will try to add a warning at the very least.
felixcollins wrote: I had moved the part table up. I shifted it back to 0x8000 and got the output below. So it seems there are more bugs...
I am sorry, this is also a known issue; see my answer to your next question for how to use the latest tool and avoid such problems. In this case it is most likely just a gdbmi timeout, and rerunning the command usually helps. You can also increase the timeout with the '--gdb-timeout-sec' argument; a value of around 3 seconds should work reliably in most cases.
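
The "rerun with a longer timeout" advice can be automated. This is a rough sketch under a few assumptions: `build_command` and `run_with_retries` are hypothetical helper names, a failed decode is assumed to end with a non-zero exit status (as an uncaught Python traceback does), and placing '--gdb-timeout-sec' before the subcommand is an assumption that should be checked against `espcoredump.py --help` for the installed version.

```python
import subprocess
import sys


def build_command(script, port, elf, timeout_sec):
    # Global options go before the subcommand, as in the invocation shown in the thread.
    return [sys.executable, script, "--port", port,
            "--gdb-timeout-sec", str(timeout_sec), "info_corefile", elf]


def run_with_retries(script, port, elf, timeout_sec=3, attempts=3,
                     runner=subprocess.run):
    """Run info_corefile until it exits with status 0; return the attempt number."""
    for attempt in range(1, attempts + 1):
        result = runner(build_command(script, port, elf, timeout_sec))
        if result.returncode == 0:
            return attempt
    raise RuntimeError(f"info_corefile failed {attempts} times")
```

The `runner` parameter exists only so the retry logic can be exercised without hardware; in normal use the default `subprocess.run` launches the real script.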
felixcollins wrote: Is there any way to get a working stand-alone core dump decoding tool?
Actually yes, there is! We have released esp-coredump as a separate Python package, so you can install it using:

Code:

pip install esp-coredump
After that you can write a wrapper that executes the script; see e.g. https://github.com/espressif/esp-coredump#examples . Please note that in your case you would also need to pass the 'parttable_off' argument. If you do not need that, the script is also available without a wrapper as a CLI command (see 'esp-coredump --help'). This way you get the latest fixes for the tool and should be able to invoke it even with an older version of IDF (backwards compatibility is guaranteed for 5.0+ but not for 4.4, though I believe there should not be any differences).
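
A wrapper along those lines might look like the sketch below. It is not taken from the esp-coredump repository: the `CoreDump` class and its `chip`, `port` and `prog` keywords follow the examples in the project README, 'parttable_off' is the argument named above, and the helper names, port and ELF path are placeholders. The keyword names should be checked against the installed package version.

```python
def build_coredump_kwargs(chip, port, elf_path, parttable_off=None):
    """Collect the keyword arguments for CoreDump (names assumed from the README)."""
    kwargs = {"chip": chip, "port": port, "prog": elf_path}
    if parttable_off is not None:
        # Only needed when the partition table is not at the default 0x8000.
        kwargs["parttable_off"] = parttable_off
    return kwargs


def print_coredump_info(chip, port, elf_path, parttable_off=None):
    # Imported here so the helper above can be used without the package installed.
    from esp_coredump import CoreDump  # requires: pip install esp-coredump

    coredump = CoreDump(**build_coredump_kwargs(chip, port, elf_path, parttable_off))
    coredump.info_corefile()  # reads the dump over the serial port and prints a summary
```

For the setup in this thread the call would be something like `print_coredump_info("esp32", "COM10", "build/beakbox3.elf", parttable_off=...)`, with the actual table offset filled in.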

Please let me know if you need any additional help.
Peter

felixcollins

Re: Monitor hangs decoding core dump

Postby felixcollins » Sun Nov 05, 2023 9:00 pm

Thanks Peter, that is very helpful!

I actually managed to track down the bug I was chasing using the GDB load of the flashed core dump. See this post...
viewtopic.php?f=20&t=36599&p=122894#p122894

The core dump revealed the function that had hung, and inspection of the code revealed the bug. The ADF claims to support running with IDF 5.1, so I'll try to update to that. I've installed the libs via the VS Code extension, which makes it harder to run mixed versions.

ESP_pdragun

Re: Monitor hangs decoding core dump

Postby ESP_pdragun » Tue Nov 14, 2023 9:59 am

Hi Felix,

I am happy to inform you that the fix for the non-default partition table offset was backported to 4.4 in commit `4ef81211`. It should be available in the next bugfix release, 4.4.7, and soon in 5.1 as well.
