ESP IDF MQTT connection lost after few hours

Cimby1
Posts: 22
Joined: Thu Aug 22, 2024 12:56 pm

Re: ESP IDF MQTT connection lost after few hours

Postby Cimby1 » Fri Oct 18, 2024 9:25 am

Where should I look for it?
On my host that runs mosquitto, or in my ESP MQTT configuration?
I believe I'm not using TLS, because I disabled the TLS connection in MQTT Explorer, which I use to monitor topics, messages, etc.

Side note:
I ran the test again without the task halt and it failed after about 8 hours.

nopnop2002
Posts: 112
Joined: Thu Oct 03, 2019 10:52 pm

Postby nopnop2002 » Fri Oct 18, 2024 10:23 am

>I disabled the TLS connection.

You are right.

You are not using SSL.

    esp_mqtt_client_config_t mqtt_cfg = {
        .broker.address.uri = MQTT_BROKER,
        .credentials.username = MQTT_USER,
        .credentials.authentication.password = MQTT_PASS,
        .network.disable_auto_reconnect = false,
        .network.reconnect_timeout_ms = 5000,
    };

However, this error message makes it look as though SSL is in use:

E (42766620) esp-tls: [sock=54] connect() error: Host is unreachable
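
The `esp-tls` tag on its own does not prove that TLS is active: as far as I know, esp-mqtt's plain TCP transport opens its socket through the same esp-tls component, so connect errors carry that tag either way. What selects the transport is the scheme of `.broker.address.uri` (`mqtt://` for plain TCP, `mqtts://` for TLS, `ws://` and `wss://` for WebSocket). A minimal sketch of that check, where `mqtt_uri_uses_tls` is a hypothetical helper and not part of ESP-IDF:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not an ESP-IDF API): returns true if the broker URI
 * scheme selects a TLS-based transport. */
bool mqtt_uri_uses_tls(const char *uri)
{
    assert(uri != NULL);
    return strncmp(uri, "mqtts://", 8) == 0 ||
           strncmp(uri, "wss://", 6) == 0;
}
```

Logging `mqtt_uri_uses_tls(MQTT_BROKER)` once at startup would settle the question independently of the log tag.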

Postby Cimby1 » Fri Oct 18, 2024 11:18 am

Then the TAG of the error message is misleading.
Which direction should I go in debugging?
Should I set more parameters in MQTT config?

Postby nopnop2002 » Fri Oct 18, 2024 7:32 pm

>Then the TAG of the error message is misleading.

Yes, that's right.

>Should I set more parameters in MQTT config?

I have never used these parameters:

        .network.disable_auto_reconnect = false,
        .network.reconnect_timeout_ms = 5000,

Postby Cimby1 » Sat Oct 19, 2024 8:09 am

Would you mind giving me a proper MQTT config that works as expected?
Maybe this is the problem.

Postby nopnop2002 » Sat Oct 19, 2024 9:51 am

>Would you mind giving me a proper MQTT config that works as expected?

The appropriate settings depend on the MQTT server you use.

It is necessary to determine whether this problem is caused by the MQTT server side or the MQTT client side.

One approach is to use a plain local MQTT server.

Postby Cimby1 » Sat Oct 19, 2024 10:03 am

I have my own server running in Docker on local hardware. I mentioned it earlier.
I want to debug this issue.

Postby nopnop2002 » Sat Oct 19, 2024 3:22 pm

Are there any useful logs on the server side when the error occurred on the ESP32 side?

Can you enable detailed logging on the server side?

Postby Cimby1 » Sat Oct 19, 2024 4:25 pm

I added warning and error logging to my mosquitto for future occurrences. We will see.
But it only happens with this code. I am running another very simple program without tasks, and it has been running for more than 2 days now.

I removed the 2 lines you mentioned from my MQTT config and am running another test.

Postby Cimby1 » Mon Oct 21, 2024 9:15 am

Finally I was (or rather, nature was) able to reproduce the error.
My village has frequent power outages that last only 2-3 seconds. That is long enough to restart my Wi-Fi router, but not to shut down my ESP or my MQTT host, because those run on a UPS.
Today I was developing code on my ESP and watching the console output when my Wi-Fi went down.
The log was the following:

I (343572) cJSON: JSON data published to topic 'comfortzone/veml7700', msg_id=65012
I (347432) wifi:bcn_timeout,ap_probe_send_start
I (349932) wifi:ap_probe_send over, resett wifi status to disassoc
I (349932) wifi:state: run -> init (0xc800)
I (349932) wifi:pm stop, total sleep time: 292482116 us / 342527560 us
I (349942) wifi:<ba-del>idx:0, tid:0
I (349942) wifi:new:<6,0>, old:<6,0>, ap:<255,255>, sta:<6,0>, prof:1, snd_ch_cfg:0x0
E (349952) transport_base: poll_read select error 113, errno = Software caused connection abort, fd = 54
I (349952) wifi station: retry to connect to the AP
E (349962) mqtt_client: Poll read error: 119, aborting connection
I (349962) wifi station: connect to the AP fail
I (349972) mqtt: MQTT_EVENT_DISCONNECTED
I (352362) wifi station: retry to connect to the AP
I (352372) wifi station: connect to the AP fail
I (352582) Simple Task: Simple task is running!
I (352582) Simple Task: Counter 22
I (354782) wifi station: retry to connect to the AP
I (354782) wifi station: connect to the AP fail
I (355662) Bme Task: Sampling data from BME680...
I (355802) cJSON: JSON data published to topic 'comfortzone/bme680', msg_id=53414
I (357192) wifi station: retry to connect to the AP
I (357192) wifi station: connect to the AP fail
I (358572) Veml Task: Sampling data from VEML7700...
I (358672) Veml Task: Sampling data from VEML7700...
I (358772) Veml Task: Sampling data from VEML7700...
I (358872) Veml Task: Sampling data from VEML7700...
I (358972) Veml Task: Sampling data from VEML7700...
I (359072) Veml Task: Sampling data from VEML7700...
I (359172) Veml Task: Sampling data from VEML7700...
I (359272) Veml Task: Sampling data from VEML7700...
I (359372) Veml Task: Sampling data from VEML7700...
I (359472) Veml Task: Sampling data from VEML7700...
I (359572) cJSON: JSON data published to topic 'comfortzone/veml7700', msg_id=5361
I (359602) wifi station: retry to connect to the AP
I (359602) wifi station: connect to the AP fail
I (359982) mqtt: MQTT_EVENT_BEFORE_CONNECT
E (359982) esp-tls: [sock=54] connect() error: Host is unreachable
E (359982) transport_base: Failed to open a new connection: 32772
E (359982) mqtt_client: Error transport connect
I (359992) mqtt: MQTT_EVENT_ERROR
I (359992) mqtt: MQTT_EVENT_DISCONNECTED
I (362022) wifi station: connect to the AP fail
I (367582) Simple Task: Simple task is running!
I (367582) Simple Task: Counter 23
I (370002) mqtt: MQTT_EVENT_BEFORE_CONNECT
E (370002) esp-tls: [sock=54] connect() error: Host is unreachable
E (370002) transport_base: Failed to open a new connection: 32772
E (370002) mqtt_client: Error transport connect
I (370012) mqtt: MQTT_EVENT_ERROR
I (370012) mqtt: MQTT_EVENT_DISCONNECTED
I (370802) Bme Task: Sampling data from BME680...
I (370942) cJSON: JSON data published to topic 'comfortzone/bme680', msg_id=26854
I (374572) Veml Task: Sampling data from VEML7700...
I (374672) Veml Task: Sampling data from VEML7700...
I (374772) Veml Task: Sampling data from VEML7700...
I (374872) Veml Task: Sampling data from VEML7700...
I (374972) Veml Task: Sampling data from VEML7700...
I (375072) Veml Task: Sampling data from VEML7700...
I (375172) Veml Task: Sampling data from VEML7700...
I (375272) Veml Task: Sampling data from VEML7700...
I (375372) Veml Task: Sampling data from VEML7700...
I (375472) Veml Task: Sampling data from VEML7700...
I (375572) cJSON: JSON data published to topic 'comfortzone/veml7700', msg_id=8532
I (380022) mqtt: MQTT_EVENT_BEFORE_CONNECT
E (380022) esp-tls: [sock=54] connect() error: Host is unreachable
E (380022) transport_base: Failed to open a new connection: 32772
E (380022) mqtt_client: Error transport connect
I (380032) mqtt: MQTT_EVENT_ERROR
I (380032) mqtt: MQTT_EVENT_DISCONNECTED

For easier reading, this is where the power went down:

I (347432) wifi:bcn_timeout,ap_probe_send_start

wifi.c trying to reconnect to the router:

I (359602) wifi station: retry to connect to the AP
I (359602) wifi station: connect to the AP fail

The same error as last time:

I (380022) mqtt: MQTT_EVENT_BEFORE_CONNECT
E (380022) esp-tls: [sock=54] connect() error: Host is unreachable
E (380022) transport_base: Failed to open a new connection: 32772
E (380022) mqtt_client: Error transport connect
I (380032) mqtt: MQTT_EVENT_ERROR
I (380032) mqtt: MQTT_EVENT_DISCONNECTED

My Wi-Fi source file is taken from the sample code provided by Espressif.

#include <string.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "freertos/event_groups.h"
#include "esp_system.h"
#include "esp_wifi.h"
#include "esp_event.h"
#include "esp_log.h"
#include "nvs_flash.h"

#include "lwip/err.h"
#include "lwip/sys.h"

/* The examples use WiFi configuration that you can set via project configuration menu

   If you'd rather not, just change the below entries to strings with
   the config you want - ie #define EXAMPLE_WIFI_SSID "mywifissid"
*/
#define EXAMPLE_ESP_WIFI_SSID "ssidname"
#define EXAMPLE_ESP_WIFI_PASS "mypassword"
#define EXAMPLE_ESP_MAXIMUM_RETRY 5

#if CONFIG_ESP_WPA3_SAE_PWE_HUNT_AND_PECK
#define ESP_WIFI_SAE_MODE WPA3_SAE_PWE_HUNT_AND_PECK
#define EXAMPLE_H2E_IDENTIFIER ""
#elif CONFIG_ESP_WPA3_SAE_PWE_HASH_TO_ELEMENT
#define ESP_WIFI_SAE_MODE WPA3_SAE_PWE_HASH_TO_ELEMENT
#define EXAMPLE_H2E_IDENTIFIER CONFIG_ESP_WIFI_PW_ID
#elif CONFIG_ESP_WPA3_SAE_PWE_BOTH
#define ESP_WIFI_SAE_MODE WPA3_SAE_PWE_BOTH
#define EXAMPLE_H2E_IDENTIFIER CONFIG_ESP_WIFI_PW_ID
#endif
#if CONFIG_ESP_WIFI_AUTH_OPEN
#define ESP_WIFI_SCAN_AUTH_MODE_THRESHOLD WIFI_AUTH_OPEN
#elif CONFIG_ESP_WIFI_AUTH_WEP
#define ESP_WIFI_SCAN_AUTH_MODE_THRESHOLD WIFI_AUTH_WEP
#elif CONFIG_ESP_WIFI_AUTH_WPA_PSK
#define ESP_WIFI_SCAN_AUTH_MODE_THRESHOLD WIFI_AUTH_WPA_PSK
#elif CONFIG_ESP_WIFI_AUTH_WPA2_PSK
#define ESP_WIFI_SCAN_AUTH_MODE_THRESHOLD WIFI_AUTH_WPA2_PSK
#elif CONFIG_ESP_WIFI_AUTH_WPA_WPA2_PSK
#define ESP_WIFI_SCAN_AUTH_MODE_THRESHOLD WIFI_AUTH_WPA_WPA2_PSK
#elif CONFIG_ESP_WIFI_AUTH_WPA3_PSK
#define ESP_WIFI_SCAN_AUTH_MODE_THRESHOLD WIFI_AUTH_WPA3_PSK
#elif CONFIG_ESP_WIFI_AUTH_WPA2_WPA3_PSK
#define ESP_WIFI_SCAN_AUTH_MODE_THRESHOLD WIFI_AUTH_WPA2_WPA3_PSK
#elif CONFIG_ESP_WIFI_AUTH_WAPI_PSK
#define ESP_WIFI_SCAN_AUTH_MODE_THRESHOLD WIFI_AUTH_WAPI_PSK
#endif

/* FreeRTOS event group to signal when we are connected*/
static EventGroupHandle_t s_wifi_event_group;

/* The event group allows multiple bits for each event, but we only care about two events:
 * - we are connected to the AP with an IP
 * - we failed to connect after the maximum amount of retries */
#define WIFI_CONNECTED_BIT BIT0
#define WIFI_FAIL_BIT BIT1

static const char *TAG = "wifi station";

static int s_retry_num = 0;

static void event_handler(void *arg, esp_event_base_t event_base,
                          int32_t event_id, void *event_data)
{
    if (event_base == WIFI_EVENT && event_id == WIFI_EVENT_STA_START)
    {
        esp_wifi_connect();
    }
    else if (event_base == WIFI_EVENT && event_id == WIFI_EVENT_STA_DISCONNECTED)
    {
        if (s_retry_num < EXAMPLE_ESP_MAXIMUM_RETRY)
        {
            esp_wifi_connect();
            s_retry_num++;
            ESP_LOGI(TAG, "retry to connect to the AP");
        }
        else
        {
            xEventGroupSetBits(s_wifi_event_group, WIFI_FAIL_BIT);
        }
        ESP_LOGI(TAG, "connect to the AP fail");
    }
    else if (event_base == IP_EVENT && event_id == IP_EVENT_STA_GOT_IP)
    {
        ip_event_got_ip_t *event = (ip_event_got_ip_t *)event_data;
        ESP_LOGI(TAG, "got ip:" IPSTR, IP2STR(&event->ip_info.ip));
        s_retry_num = 0;
        xEventGroupSetBits(s_wifi_event_group, WIFI_CONNECTED_BIT);
    }
}

void wifi_init_sta(void)
{
    s_wifi_event_group = xEventGroupCreate();

    ESP_ERROR_CHECK(esp_netif_init());

    ESP_ERROR_CHECK(esp_event_loop_create_default());
    esp_netif_create_default_wifi_sta();

    wifi_init_config_t cfg = WIFI_INIT_CONFIG_DEFAULT();
    ESP_ERROR_CHECK(esp_wifi_init(&cfg));

    esp_event_handler_instance_t instance_any_id;
    esp_event_handler_instance_t instance_got_ip;
    ESP_ERROR_CHECK(esp_event_handler_instance_register(WIFI_EVENT,
                                                        ESP_EVENT_ANY_ID,
                                                        &event_handler,
                                                        NULL,
                                                        &instance_any_id));
    ESP_ERROR_CHECK(esp_event_handler_instance_register(IP_EVENT,
                                                        IP_EVENT_STA_GOT_IP,
                                                        &event_handler,
                                                        NULL,
                                                        &instance_got_ip));

    wifi_config_t wifi_config = {
        .sta = {
            .ssid = EXAMPLE_ESP_WIFI_SSID,
            .password = EXAMPLE_ESP_WIFI_PASS,
            /* Authmode threshold resets to WPA2 as default if password matches WPA2 standards (password len => 8).
             * If you want to connect the device to deprecated WEP/WPA networks, Please set the threshold value
             * to WIFI_AUTH_WEP/WIFI_AUTH_WPA_PSK and set the password with length and format matching to
             * WIFI_AUTH_WEP/WIFI_AUTH_WPA_PSK standards.
             */
            .scan_method = WIFI_ALL_CHANNEL_SCAN,
            .threshold.authmode = WIFI_AUTH_WPA2_PSK,
            .sae_pwe_h2e = WPA3_SAE_PWE_BOTH,
            //.sae_h2e_identifier = CONFIG_ESP_WIFI_PW_ID,
        },
    };
    ESP_ERROR_CHECK(esp_wifi_set_mode(WIFI_MODE_STA));
    ESP_ERROR_CHECK(esp_wifi_set_config(WIFI_IF_STA, &wifi_config));
    ESP_ERROR_CHECK(esp_wifi_start());

    ESP_LOGI(TAG, "wifi_init_sta finished.");

    /* Waiting until either the connection is established (WIFI_CONNECTED_BIT) or connection failed for the maximum
     * number of re-tries (WIFI_FAIL_BIT). The bits are set by event_handler() (see above) */
    EventBits_t bits = xEventGroupWaitBits(s_wifi_event_group,
                                           WIFI_CONNECTED_BIT | WIFI_FAIL_BIT,
                                           pdFALSE,
                                           pdFALSE,
                                           portMAX_DELAY);

    /* xEventGroupWaitBits() returns the bits before the call returned, hence we can test which event actually
     * happened. */
    if (bits & WIFI_CONNECTED_BIT)
    {
        ESP_LOGI(TAG, "connected to ap SSID:%s password:%s",
                 EXAMPLE_ESP_WIFI_SSID, EXAMPLE_ESP_WIFI_PASS);
    }
    else if (bits & WIFI_FAIL_BIT)
    {
        ESP_LOGI(TAG, "Failed to connect to SSID:%s, password:%s",
                 EXAMPLE_ESP_WIFI_SSID, EXAMPLE_ESP_WIFI_PASS);
    }
    else
    {
        ESP_LOGE(TAG, "UNEXPECTED EVENT");
    }
}

My guess is that I ran out of retries (EXAMPLE_ESP_MAXIMUM_RETRY), which is why I can't publish MQTT messages: the ESP is no longer connected to the network.

Now it's time to fix this in the Wi-Fi code. Any advice on how I should do it?

I don't want to allow infinite attempts, because this is how I imagine the project working:
  1. I give my project to someone who wants to use it, with default credentials, etc.
  2. Obviously, the ESP can't connect to anything the first time.
  3. There is an app called ESP Touch that lets you connect the device to a new Wi-Fi network via a smartphone.

At least, I want to implement this feature in the future, but I know little about ESP Touch. I can only find examples for the touch input of the ESP, not for the ESP Touch application.

I hope this explains well enough what I found and how I'd like to solve it.
