How to debug an "just after bootloader" hang?

PanicanWhyasker
Posts: 45
Joined: Sun Jan 06, 2019 12:42 pm

How to debug an "just after bootloader" hang?

Postby PanicanWhyasker » Thu Sep 03, 2020 10:00 pm

Hi,

I have a very unusual case: a device that was shipped to a client worked flawlessly for several months and then went dead out of the blue. It is stuck just after the bootloader. When I connect to the serial port, I see this:

ets Jun  8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:152
load:0x40078000,len:10012
load:0x40080400,len:5984
entry 0x4008064c
ets Jun  8 2016 00:22:57

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:152
load:0x40078000,len:10012
load:0x40080400,len:5984
entry 0x4008064c

There is a 9-second delay between the two resets, which matches the 9000 ms RTC watchdog timeout I configured in my bootloader.
So it seems the main program is loaded and then hangs. I have many devices identical to the bricked one; on the working ones the boot messages are exactly the same, but the main program runs.
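
To compare a captured log from the bricked unit against one from a working unit, the ROM output can be reduced to reset reasons, load segments and entry point. The following is only a minimal sketch: the function names are hypothetical, the sample is copied from the log above, and it assumes the `len:` values are printed in decimal.

```python
import re

# Patterns for the three kinds of ROM boot log lines we care about.
RESET_RE = re.compile(r"rst:(0x[0-9a-fA-F]+) \((\w+)\)")
LOAD_RE = re.compile(r"load:(0x[0-9a-fA-F]+),len:(\d+)")
ENTRY_RE = re.compile(r"entry (0x[0-9a-fA-F]+)")

# Sample copied from the serial capture in this post (unrelated lines omitted).
SAMPLE_LOG = """\
rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
load:0x3fff0018,len:4
load:0x3fff001c,len:152
load:0x40078000,len:10012
load:0x40080400,len:5984
entry 0x4008064c
rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
load:0x3fff0018,len:4
load:0x3fff001c,len:152
load:0x40078000,len:10012
load:0x40080400,len:5984
entry 0x4008064c
"""


def parse_boot_log(text):
    """Return one record per reset banner found in the log text."""
    boots = []
    for line in text.splitlines():
        m = RESET_RE.search(line)
        if m:
            # A new "rst:" line starts a new boot attempt.
            boots.append({"reset": m.group(2), "segments": [], "entry": None})
            continue
        if not boots:
            continue  # ignore anything before the first reset banner
        m = LOAD_RE.search(line)
        if m:
            # Address is hex; length is assumed to be decimal.
            boots[-1]["segments"].append((int(m.group(1), 16), int(m.group(2))))
            continue
        m = ENTRY_RE.search(line)
        if m:
            boots[-1]["entry"] = int(m.group(1), 16)
    return boots


def same_image_each_boot(boots):
    """True if every boot attempt loaded the same segments and entry point."""
    return all(
        b["segments"] == boots[0]["segments"] and b["entry"] == boots[0]["entry"]
        for b in boots
    )


if __name__ == "__main__":
    for boot in parse_boot_log(SAMPLE_LOG):
        print(boot["reset"], [hex(a) for a, _ in boot["segments"]], hex(boot["entry"]))
```

Running the same parser over a log from a working device and comparing the records would confirm whether the two units really load identical segments before diverging.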

My question is: is it possible to get more debug info? What does it try to do after the "entry 0x4008064c" line? What region of the flash does it execute?

System info:
- ESP IDF v3.2
- Custom ESP PCB, but it has worked solidly on 100s of devices over 2 years
- Flash is encrypted. The bootloader, factory app and OTA_1/OTA_2 partitions look intact (ciphered gibberish, entropy near 1)
- Same devices as the one mentioned in the "Device bricked ("csum err") after two months of service" thread, but these are the new, stable revision. Hence I'm quite curious what went wrong this time.
- It's just one device. We'll be fine to throw it out if it's undebuggable. Pursuing mostly out of curiosity and for possible reliability improvement.

PanicanWhyasker
Posts: 45
Joined: Sun Jan 06, 2019 12:42 pm

Re: How to debug an "just after bootloader" hang?

Postby PanicanWhyasker » Tue Sep 08, 2020 10:23 am

Any ideas?

ESP_Mahavir
Posts: 190
Joined: Wed Jan 24, 2018 6:51 am

Re: How to debug an "just after bootloader" hang?

Postby ESP_Mahavir » Tue Sep 08, 2020 11:45 am

Hello @PanicanWhyasker,

$ ./components/esptool_py/esptool/esptool.py --chip esp32 image_info bootloader.bin
esptool.py v3.0-dev
Image version: 1
Entry point: 40080688
4 segments

Segment 1: len 0x00004 load 0x3fff0030 file_offs 0x00000018 [BYTE_ACCESSIBLE, DRAM, DIRAM_DRAM]
Segment 2: len 0x01be8 load 0x3fff0034 file_offs 0x00000024 [BYTE_ACCESSIBLE, DRAM, DIRAM_DRAM]
Segment 3: len 0x03580 load 0x40078000 file_offs 0x00001c14 [CACHE_APP]
Segment 4: len 0x00fa0 load 0x40080400 file_offs 0x0000519c [IRAM]
Checksum: aa (valid)
Validation Hash: ee0b590d4bb6fc6f03ffff7ed227a9fdce31a47f111abb5132bad1f1227a50f7 (valid)

If you run a similar command on your side, it will list the loadable segments of the 2nd stage bootloader image. The entry point is the address the ROM bootloader jumps to (it sets the PC to that value) after loading this image into internal memory.
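
For a scripted comparison against the `load:`/`entry` lines of a boot log, the same fields can be read directly from the plaintext `bootloader.bin` produced by the build. This is a rough sketch under stated assumptions, not a replacement for `esptool.py image_info`: it assumes the ESP32 layout of an 8-byte common header followed by a 16-byte extended header, it assumes the flash-mode byte is encoded as 0=QIO, 1=QOUT, 2=DIO, 3=DOUT, it does not verify the checksum or hash, and the function names are hypothetical.

```python
import struct

ESP_IMAGE_MAGIC = 0xE9
# Assumed encoding of the flash-mode byte in the image header.
FLASH_MODES = {0: "QIO", 1: "QOUT", 2: "DIO", 3: "DOUT"}
# Assumed ESP32 layout: 8-byte common header + 16-byte extended header.
HEADER_LEN = 24


def parse_image(data):
    """Extract entry point, flash mode and segment table from image bytes."""
    magic, seg_count, flash_mode, _size_freq, entry = struct.unpack_from(
        "<BBBBI", data, 0
    )
    if magic != ESP_IMAGE_MAGIC:
        # A dump read back from encrypted flash will typically fail here.
        raise ValueError(f"bad magic 0x{magic:02x}: encrypted or not an image")
    offset = HEADER_LEN
    segments = []
    for _ in range(seg_count):
        # Each segment is an 8-byte header (load address, length) plus data.
        load_addr, length = struct.unpack_from("<II", data, offset)
        segments.append((load_addr, length))
        offset += 8 + length
    return {
        "entry": entry,
        "flash_mode": FLASH_MODES.get(flash_mode, "unknown"),
        "segments": segments,
    }


def matches_boot_log(image, log_segments, log_entry):
    """True if the image's segments and entry equal those printed by the ROM."""
    return image["segments"] == log_segments and image["entry"] == log_entry
```

A possible use is `parse_image(open("bootloader.bin", "rb").read())`, then passing the result to `matches_boot_log` together with the addresses, lengths and entry point taken from the serial log. A match would indicate that the `load:` lines describe the 2nd stage bootloader rather than the application.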

My suspicion is that the 2nd stage bootloader is getting stuck somewhere and eventually triggering the RTC WDT reset. Could you enable debug logs in the 2nd stage bootloader (config option `CONFIG_LOG_BOOTLOADER_LEVEL_DEBUG`) and see whether that yields additional debug information?

Mahavir

ESP_Sprite
Posts: 9723
Joined: Thu Nov 26, 2015 4:08 am

Re: How to debug an "just after bootloader" hang?

Postby ESP_Sprite » Tue Sep 08, 2020 3:10 pm

As for why it broke: it might be a problem with the QIO pins on the flash. If memory serves, the ROM code uses standard SPI, but the bootloader switches that over to QPI if configured.
