Over-the-air update: Panic happens when esp_ota_end() is called

henrysshunt
Posts: 10
Joined: Thu Jun 22, 2023 10:30 am

Over-the-air update: Panic happens when esp_ota_end() is called

Postby henrysshunt » Wed Jul 12, 2023 10:34 am

I’m adding OTA firmware update functionality over Bluetooth (BLE) and am having issues where calling esp_ota_end() constantly causes a panic. I have had this working previously during development. There seems to be no issues with starting the update and writing the chunks of firmware to flash. I have tried erasing the flash (using idf.py erase-flash), thinking that maybe something corrupted it somehow, but that didn’t help. I have also tried reading the firmware file from an SD card instead of over Bluetooth and get the same result.

I'm using the ESP32-S3 Mini with 8MB of flash and have set the partition table to factory app with two OTA partitions. At the moment I have no security features enabled. I'm using esp_ota_get_next_update_partition() to get the partition to use for the update and am using OTA_WITH_SEQUENTIAL_WRITES when beginning the update.

I get this log output when the end function is called (unsure if it's useful):

Code: Select all

I (67349) esp_image: segment 0: paddr=00110020 vaddr=3c070020 size=220ach (139436) map
Today I had a colleague flash my board with what was accidentally an unfinished version of this feature, and the firmware update ran and completed fine. He then flashed it with the latest version of our code and it didn’t work. I then flashed the same board with the same unfinished version on my computer and it didn’t work. I then had my colleague again flash the board with the unfinished version, which had worked before, and it no longer worked. I repeated this after cleaning the build as well as after erasing the flash, both times with the same result.

Here is what my stack looks like after the panic. Seems to be something to do with verifying the firmware image?
Screenshot 2023-07-11 165445.png
Screenshot 2023-07-11 165445.png (22.75 KiB) Viewed 3909 times

Does anyone have any ideas on what could be causing this?

Thanks.

ESP_Sprite
Posts: 9764
Joined: Thu Nov 26, 2015 4:08 am

Re: Over-the-air update: Panic happens when esp_ota_end() is called

Postby ESP_Sprite » Thu Jul 13, 2023 5:25 am

What particular error does the panic throw?

henrysshunt
Posts: 10
Joined: Thu Jun 22, 2023 10:30 am

Re: Over-the-air update: Panic happens when esp_ota_end() is called

Postby henrysshunt » Thu Jul 13, 2023 2:55 pm

So I had another look at this again today, and found that the problem has gone away on both of the boards I was using (both of which yesterday had OTA updates not working in the way described above), despite not making any further changes to try and fix this.

In terms of your question, I don't really have any particular error to report (which is why I included the stack trace). When run with the debugger, it just hangs and when paused I find it sitting in the panic handler's infinite loop. When run without the debugger, it just resets and there's a one-line mention of the panic handler but that's it.

henrysshunt
Posts: 10
Joined: Thu Jun 22, 2023 10:30 am

Re: Over-the-air update: Panic happens when esp_ota_end() is called

Postby henrysshunt » Tue Jul 18, 2023 3:55 pm

This issue has recurred in the same was as explained above, again after having made no specific changes to the related code (as this feature had been finished and working).

I don't have any further information to give or look for on this. The call to esp_image_verify() inside esp_ota_end() is causing a panic for some reason. How should I proceed further on debugging this? If the verification is truly failing then I'm expecting an ESP_ERR_OTA_VALIDATE_FAILED return value, not a panic.

ESP_Sprite
Posts: 9764
Joined: Thu Nov 26, 2015 4:08 am

Re: Over-the-air update: Panic happens when esp_ota_end() is called

Postby ESP_Sprite » Wed Jul 19, 2023 12:56 am

Hm. What I can see is that the panic is called by highint4, which is hooked up to the cache error handler and the interrupt watchdog. The first triggers when cache access is done while the cache is unavailable, which typically happens when writing the flash. The second is done when interrupts are disabled for too long (e.g. when an interrupt handler gets stuck). Without seeing the output of the panic handler, it's a bit hard to see which of the two it is.

You have a debugger attached, right? Can you check what the EXCAUSE register value is in the panic handler?

henrysshunt
Posts: 10
Joined: Thu Jun 22, 2023 10:30 am

Re: Over-the-air update: Panic happens when esp_ota_end() is called

Postby henrysshunt » Wed Jul 19, 2023 9:33 am

Checking with the Eclipse debugger, EXCAUSE is set to 4, which I guess is the latter of your two points (Level1InterruptCause). Just to see if it would change anything I increased the interrupt watchdog timeout to 10 seconds, but that didn't have any impact. Let me know if there's any other info you'd like to see. The INTERRUPT register is set to 10000000011001000001100000 if that's of any use.

ESP_Sprite
Posts: 9764
Joined: Thu Nov 26, 2015 4:08 am

Re: Over-the-air update: Panic happens when esp_ota_end() is called

Postby ESP_Sprite » Thu Jul 20, 2023 2:18 am

Interesting, that reason seems to be PANIC_RSN_MEMPROT but the level-4 interrupt handler never sets EXCCAUSE to that value. Just to verify: what ESP-IDF version are you using? Possibly something changed between versions.

henrysshunt
Posts: 10
Joined: Thu Jun 22, 2023 10:30 am

Re: Over-the-air update: Panic happens when esp_ota_end() is called

Postby henrysshunt » Fri Jul 21, 2023 4:01 pm

I'm using ESP-IDF version 5.0.2.

henrysshunt
Posts: 10
Joined: Thu Jun 22, 2023 10:30 am

Re: Over-the-air update: Panic happens when esp_ota_end() is called

Postby henrysshunt » Thu Oct 12, 2023 10:08 am

We're still having this issue and have made no further progress on it. Does anyone have any ideas?

Thanks.

zeboxer
Posts: 5
Joined: Wed Aug 23, 2023 8:25 am

Re: Over-the-air update: Panic happens when esp_ota_end() is called

Postby zeboxer » Mon Oct 16, 2023 7:23 am

I encountered with this issue too (ESP IDF v5.1-dirty, ESP32-S3-WROOM-1 N4R8). I have http handler for OTA update (simple rest server). Write operation works successfully. But when calls esp_ota_end app panic by WDT. I goto esp_ota_end and added debug prints. I detected that app panic at:

Code: Select all

   
   # in esp_ota_ops.c file at function esp_ota_end
   
   if (esp_image_verify(ESP_IMAGE_VERIFY, &part_pos, &data) != ESP_OK) {
        ret = ESP_ERR_OTA_VALIDATE_FAILED;

        goto cleanup;
  }
esp_image_verify calls panic by WDT.
I tried to comment out block of code above (in the component source file), but panic was call again at function esp_ota_set_boot_partition, in my code.

My firmware image size is 1.5Mb

My stdout logs:

Code: Select all

I (18772) esp-rest: Starting OTA update
I (18782) esp-rest: Running partition type 0 subtype 16 (offset 0x00020000)
I (18792) esp-rest: Writing to partition subtype 17 at offset 0x1d0000
I (18792) esp-rest: Remaining size : 1511520
I (18802) esp-rest: New firmware version: 0.1.2
I (18802) esp-rest: Running firmware version: 0.1.2
I (18812) esp-rest: esp_ota_begin succeeded
I (18892) esp-rest: Remaining size : 1510496
I (18892) esp-rest: Remaining size : 1509472
...
...
...
I (77342) esp-rest: Remaining size : 1120
I (77402) esp-rest: Remaining size : 96
I (77482) esp-rest: Written image length 1511520
I (77482) esp_image: segment 0: paddr=001d0020 vaddr=3c110020 size=49e38h (302648) map

ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x8 (TG1WDT_SYS_RST),boot:0x2b (SPI_FAST_FLASH_BOOT)
Saved PC:0x42004503
0x42004503: panic_handler at /home/ESP/esp-idf/components/esp_system/port/panic_handler.c:145 (discriminator 3)

SPIWP:0xee
mode:DIO, clock div:2
load:0x3fce3810,len:0x18ec
load:0x403c9700,len:0x4
load:0x403c9704,len:0xf28
load:0x403cc700,len:0x2fcc
SHA-256 comparison failed:
Calculated: a892db2a10bcaf0be3a4d03d36dbd01129642fe329960e50b39d4b282cf6b122
Expected: ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
Attempting to boot anyway...
entry 0x403c9964
I (51) boot: ESP-IDF v5.1-dirty 2nd stage bootloader
I (51) boot: compile time Oct 14 2023 13:29:32
I (52) boot: Multicore bootloader
I (55) boot: chip revision: v0.1
I (59) boot.esp32s3: Boot SPI Speed : 40MHz
I (64) boot.esp32s3: SPI Mode       : DIO
I (68) boot.esp32s3: SPI Flash Size : 4MB

W (73) boot.esp32s3: PRO CPU has been reset by WDT.
W (79) boot.esp32s3: APP CPU has been reset by WDT.

I (84) boot: Enabling RNG early entropy source...
I (90) boot: Partition Table:
I (93) boot: ## Label            Usage          Type ST Offset   Length
I (100) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (108) boot:  1 otadata          OTA data         01 00 0000f000 00002000
I (116) boot:  2 phy_init         RF data          01 01 00011000 00001000
I (123) boot:  3 ota_0            OTA app          00 10 00020000 001a9000
I (131) boot:  4 ota_1            OTA app          00 11 001d0000 001a9000
I (138) boot:  5 data_fs          Unknown data     01 82 00379000 0004b000
I (146) boot: End of partition table

I (150) esp_image: segment 0: paddr=00020020 vaddr=3c110020 size=49e38h (302648) map
I (210) esp_image: segment 1: paddr=00069e60 vaddr=3fc9bf00 size=0517ch ( 20860) load
I (214) esp_image: segment 2: paddr=0006efe4 vaddr=40374000 size=01034h (  4148) load
I (217) esp_image: segment 3: paddr=00070020 vaddr=42000020 size=10a1f0h (1090032) map
I (409) esp_image: segment 4: paddr=0017a218 vaddr=40375034 size=16e38h ( 93752) load
I (437) boot: Loaded app from partition at offset 0x20000
I (437) boot: Disabling RNG early entropy source...
I (448) cpu_start: Multicore app
I (449) octal_psram: vendor id    : 0x0d (AP)
I (449) octal_psram: dev id       : 0x02 (generation 3)
I (452) octal_psram: density      : 0x03 (64 Mbit)
I (457) octal_psram: good-die     : 0x01 (Pass)
I (463) octal_psram: Latency      : 0x01 (Fixed)
I (468) octal_psram: VCC          : 0x01 (3V)
I (473) octal_psram: SRF          : 0x01 (Fast Refresh)
I (479) octal_psram: BurstType    : 0x01 (Hybrid Wrap)
I (485) octal_psram: BurstLen     : 0x01 (32 Byte)
I (490) octal_psram: Readlatency  : 0x02 (10 cycles@Fixed)
I (496) octal_psram: DriveStrength: 0x00 (1/1)
I (502) esp_psram: Found 8MB PSRAM device
I (506) esp_psram: Speed: 40MHz
I (718) mmu_psram: Instructions copied and mapped to SPIRAM
I (719) cpu_start: Pro cpu up.
I (719) cpu_start: Starting app cpu, entry point is 0x403756e8
0x403756e8: call_start_cpu1 at /home/ESP/esp-idf/components/esp_system/port/cpu_start.c:154

I (0) cpu_start: App cpu up.
I (1357) esp_psram: SPI SRAM memory test OK
I (1367) cpu_start: Pro cpu start user code
I (1367) cpu_start: cpu freq: 160000000 Hz
I (1367) cpu_start: Application information:
...
...
Anybody have idea why this

Who is online

Users browsing this forum: Bing [Bot], Majestic-12 [Bot] and 127 guests