Page 1 of 1

Unable to separate wifi and SPI?

Posted: Wed Jan 17, 2018 5:19 pm
by nathan_whittington
I wrote some code to use hardware accelerated SPI with HSPI_HOST pins. After I got everything working I tried to move the wifi initialization stuff and ran into a crash reboot bug.

The errors are varied and don't seem to apply to the change I made. Here's one:

Code: Select all

	I (141) apa102: ... Initializing bus.
	I (142) apa102: ... Adding device bus.
	I (146) apa102: ... Transmitting.
	E (249) spi_master: spi_device_queue_trans(656): rxdata transfer > host maximum
	ESP_ERROR_CHECK failed: esp_err_t 0x102 at 0x40106ee9
	0x40106ee9: app_main at /home/na/slow/na/source/esp32/esp32_udp_apa102/main/./main.c:188 (discriminator 1)

	file: "/home/na/slow/na/source/esp32/esp32_udp_apa102/main/./main.c" line 188
	func: app_main
	expression: spi_deI (261) wifi: n:6 0, o:1 0, ap:255 255, sta:6 0, prof:1
	I (922) wifi: state: init -> auth (b0)
	vice_transmit(handle, &trans_desc)

	Backtrace: 0x40087a98:0x3ffb9620 0x40087e05:0x3ffb9640 0x40106ee9:0x3ffb9660 0x400d0fb2:0x3ffb96f0
	0x40087a98: invoke_abort at /home/na/slow/na/source/esp-idf/components/esp32/./panic.c:572

	0x40087e05: _esp_error_check_failed at /home/na/slow/na/source/esp-idf/components/esp32/./panic.c:584

	0x40106ee9: app_main at /home/na/slow/na/source/esp32/esp32_udp_apa102/main/./main.c:188 (discriminator 1)

	0x400d0fb2: main_task at /home/na/slow/na/source/esp-idf/components/esp32/./cpu_start.c:449


	Rebooting...
Here's another:

Code: Select all

	I (152) apa102: ... Initializing bus.
	ESP_ERROR_CHECK failed: esp_err_t 0x101 at 0x40106e64
	0x40106e64: app_main at /home/na/slow/na/source/esp32/esp32_udp_apa102/main/./main.c:151 (discriminator 1)

	file: "/home/na/slow/na/source/esp32/esp32_udp_apa102/main/./main.c" line 151
	func: app_main
	expression: spi_bus_initialize(HSPI_HOST, &bus_config, 1)

	Backtrace: 0x40087a98:0x3ffb9550 0x40087e05:0x3ffb9570 0x40106e64:0x3ffb9590 0x400d0fba:0x3ffb96f0
	0x40087a98: invoke_abort at /home/na/slow/na/source/esp-idf/components/esp32/./panic.c:572

	0x40087e05: _esp_error_check_failed at /home/na/slow/na/source/esp-idf/components/esp32/./panic.c:584

	0x40106e64: app_main at /home/na/slow/na/source/esp32/esp32_udp_apa102/main/./main.c:151 (discriminator 1)

	0x400d0fba: main_task at /home/na/slow/na/source/esp-idf/components/esp32/./cpu_start.c:449


	Rebooting...
After some experimentation I found that initializing the wifi in a function was causing the spi to crash. If I initialize wifi in my app_main everything works fine.

I'm running on linux and I updated my toolchain and my esp-idf repo (and submodules). I've got an esp32 wroom dev board.

I stripped down the code and setup a conditional to select the two different ways of setting up the wifi so I could demonstrate the issue more clearly but just adding the curly braces caused the crash. This makes me suspect it's an issue with scope.

So here's my code. There is a compile time #define CRASH_LOOP. Enable this and the board will stop working.

Code: Select all

#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "freertos/event_groups.h"
#include <string.h>
#include <sys/socket.h>
#include "esp_wifi.h"
#include "esp_system.h"
#include "esp_types.h"
#include "esp_event.h"
#include "esp_event_loop.h"
#include "nvs_flash.h"
#include "driver/gpio.h"
#include <esp_log.h>
#include <esp_heap_caps.h>
#include <driver/spi_master.h>

static char tag[] = "apa102";

EventGroupHandle_t udp_event_group;
#define WIFI_CONNECTED_BIT BIT0

/*****************************************************************************
 *****************************************************************************/
static esp_err_t event_handler(void *ctx, system_event_t *event)
{
    switch(event->event_id) {
    case SYSTEM_EVENT_STA_START:
        esp_wifi_connect();
        break;
    case SYSTEM_EVENT_STA_DISCONNECTED:
    	ESP_LOGI(tag, "event_handler:SYSTEM_EVENT_STA_DISCONNECTED!");
        esp_wifi_connect();
        xEventGroupClearBits(udp_event_group, WIFI_CONNECTED_BIT);
        break;
    case SYSTEM_EVENT_STA_CONNECTED:
        break;
    case SYSTEM_EVENT_STA_GOT_IP:
    	ESP_LOGI(tag, "got ip:%s\n",
		ip4addr_ntoa(&event->event_info.got_ip.ip_info.ip));
    	xEventGroupSetBits(udp_event_group, WIFI_CONNECTED_BIT);
        break;
    default:
        break;
    }
    return ESP_OK;
}

/*****************************************************************************
 *****************************************************************************/
void wifi_init_sta()
{
    udp_event_group = xEventGroupCreate();
    tcpip_adapter_init();
    ESP_ERROR_CHECK(esp_event_loop_init(event_handler, NULL) );

    wifi_init_config_t cfg = WIFI_INIT_CONFIG_DEFAULT();
    ESP_ERROR_CHECK(esp_wifi_init(&cfg));
    wifi_config_t wifi_config = {
        .sta = {
            .ssid = CONFIG_WIFI_SSID,
            .password = CONFIG_WIFI_PASSWORD,
            .bssid_set = false
        },
    };

    ESP_ERROR_CHECK(esp_wifi_set_mode(WIFI_MODE_STA) );
    ESP_ERROR_CHECK(esp_wifi_set_config(ESP_IF_WIFI_STA, &wifi_config) );
    ESP_ERROR_CHECK(esp_wifi_start() );
}

/*****************************************************************************
 *****************************************************************************/
void app_main(void)
{

    esp_err_t ret = nvs_flash_init();
    if (ret == ESP_ERR_NVS_NO_FREE_PAGES) {
        ESP_ERROR_CHECK(nvs_flash_erase());
        ret = nvs_flash_init();
    }
    ESP_ERROR_CHECK( ret );

// #define CRASH_LOOP

#ifdef CRASH_LOOP
        wifi_init_sta();
#else
        //---------------------------------------------------------------- 
        // NOTE: unlike the udp_perf example, I can't move the wifi setup to a
        // function without causing the spi transmit to crash and reboot. 
        udp_event_group = xEventGroupCreate();
        ESP_ERROR_CHECK(esp_event_loop_init(event_handler, NULL) );

        tcpip_adapter_init();
        wifi_init_config_t cfg = WIFI_INIT_CONFIG_DEFAULT();
        ESP_ERROR_CHECK(esp_wifi_init(&cfg));
        wifi_config_t wifi_config = {
            .sta = {
                .ssid = CONFIG_WIFI_SSID,
                .password = CONFIG_WIFI_PASSWORD,
                .bssid_set = false
            },
        };

        ESP_ERROR_CHECK(esp_wifi_set_mode(WIFI_MODE_STA) );
        ESP_ERROR_CHECK(esp_wifi_set_config(ESP_IF_WIFI_STA, &wifi_config) );
        ESP_ERROR_CHECK(esp_wifi_start() );
        //---------------------------------------------------------------- 
#endif



    //-----------------------------------------------------------------------
    // Setup HSPI as a Master and transmit 12 bytes
    //-----------------------------------------------------------------------
    spi_bus_config_t bus_config;
    bus_config.sclk_io_num = 18; // CLK
    bus_config.mosi_io_num = 23; // MOSI
    bus_config.miso_io_num = -1; // MISO (not used)
    bus_config.quadwp_io_num = -1; // Not used
    bus_config.quadhd_io_num = -1; // Not used
    ESP_LOGI(tag, "... Initializing bus.");
    ESP_ERROR_CHECK(spi_bus_initialize(HSPI_HOST, &bus_config, 1));

    spi_device_handle_t handle;
    spi_device_interface_config_t dev_config;
    dev_config.address_bits = 0;
    dev_config.command_bits = 0;
    dev_config.dummy_bits = 0;
    dev_config.mode = 0;
    dev_config.duty_cycle_pos = 0;  // 50%
    dev_config.cs_ena_posttrans = 0;
    dev_config.cs_ena_pretrans = 0;
    dev_config.clock_speed_hz=10*1000*1000,               //Clock out at 10 MHz
    dev_config.spics_io_num = -1;
    dev_config.flags = 0;
    dev_config.queue_size = 1;
    dev_config.pre_cb = NULL;
    dev_config.post_cb = NULL;
    ESP_LOGI(tag, "... Adding device bus.");
    ESP_ERROR_CHECK(spi_bus_add_device(HSPI_HOST, &dev_config, &handle));

	spi_transaction_t trans_desc;
	trans_desc.addr = 0;
	trans_desc.cmd = 0;
	trans_desc.flags = 0;

    unsigned char *data = heap_caps_malloc(12*4, MALLOC_CAP_DMA);
    trans_desc.length = 12 * 8;
    data[0] = 0x00;
    data[1] = 0x00;
    data[2] = 0x00;
    data[3] = 0x00;
    data[4] = 0xff;
    data[5] = 0x03;
    data[6] = 0x0c;
    data[7] = 0xff;
    data[8] = 0x00;
    data[9] = 0x00;
    data[10] = 0x00;
    data[11] = 0x00;

	trans_desc.tx_buffer = data;

	ESP_LOGI(tag, "... Transmitting.");
    //------------------------------------------------------------------------

    vTaskDelay(100 / portTICK_PERIOD_MS);

    while (true) {
        ESP_ERROR_CHECK(spi_device_transmit(handle, &trans_desc));

        vTaskDelay(1000 / portTICK_PERIOD_MS);
    }
}
Why is this failing? Do I just have to live with keeping all my code in app_main?

Re: Unable to separate wifi and SPI?

Posted: Thu Jan 18, 2018 12:25 am
by ESP_Angus
Hi nathan,

Uninitialized memory bugs can be hard to pin down because they can result in all kinds of weird behaviour. But there is a hint in the log here:

Code: Select all

E (249) spi_master: spi_device_queue_trans(656): rxdata transfer > host maximum
If you grep for this error message in IDF, you'll find it in spi_master.c:

Code: Select all

    SPI_CHECK(trans_desc->rxlength <= handle->host->max_transfer_sz*8, "rxdata transfer > host maximum", ESP_ERR_INVALID_ARG);
The key is, some fields of spi trans_desc are being left uninitialized, including .rxlength. The bug is here:

Code: Select all

   spi_transaction_t trans_desc;
   trans_desc.addr = 0;
   trans_desc.cmd = 0;
   trans_desc.flags = 0;
Because C doesn't initialize automatic (stack-resident) variables. So the other fields in trans_desc are left with whatever values were in the pre-existing stack memory. Adding a function means suddenly some addresses are being pushed on the stack which weren't previously being pushed there. Bingo, new random value in the rxlength field!

A simple fix for this is:

Code: Select all

    spi_transaction_t trans_desc = { };
The { } is an empty initializer list, and it causes all fields in the struct which aren't named (ie all of them here) to be initialized to zero. You can delete the subsequent zeroing lines in this case, as the compiler will optimize them out now that it can see that it's already zeroed the whole structure.

Suggest using this pattern (explicit initializer) every time you initialize a struct on the stack, to avoid surprises (even if you're sure that you've set every field, a future version of the library may add new fields...) If the compiler can see you're redundantly zeroing the field and then setting it to a different value a few lines down, it will optimize around this so there's no performance hit.

Where you have non-zero fields, you can use C99-style initializers like this:

Code: Select all

   spi_bus_config_t bus_config [ {
       .sclk_io_num = 18, // CLK
       .mosi_io_num = 23, // MOSI
      ...etc, etc.
      };
 
(Again, any fields which aren't mentioned in the list will be automatically zeroed by the compiler.)

BTW, one more little tip, if you declare "static void wifi_init_sta()" then the compiler will probably notice that this static function is only ever called in one place, app_main(), and inline the implementation into app_main() which will reduce your code size by a small amount (in the binary, it will be as if you hadn't written a separate function). That said, if that function had been inlined originally then you wouldn't have uncovered this unitialized memory bug and it might have cropped up later in some even harder to find way...)

Angus

Re: Unable to separate wifi and SPI?

Posted: Thu Jan 18, 2018 4:42 am
by nathan_whittington
Awesome, thank you! I was unfamiliar with that syntax. Everything works ok now.