LWIP listening socket breaks when attempting to connect to WiFi access point

Oromis
Posts: 21
Joined: Mon Sep 25, 2017 1:44 pm

Postby Oromis » Thu Aug 16, 2018 11:06 am

Hi,

our product uses the ESP32's WiFi to connect to the internet. To allow our end-customers to enter their WiFi SSID and password, we enable both station mode and access point mode at the same time. Using LWIP, we create a socket(), bind() it to port 80, listen() and serve a configuration interface with an input form when someone connects via HTTP.

So far, so good. When we get WiFi credentials, we attempt to connect the ESP32's WiFi station to the given access point. This itself works well: the station connects successfully. The trouble is with the HTTP server running on our access point: most of the time (not always), it stops responding to incoming requests. The ESP32's WiFi access point remains up and other devices are still connected to it, but they can't reach its port 80 any more.

Further observations:
  • The HTTP server doesn't break if the SSID provided by the user does not exist
  • The HTTP server does break if the SSID exists, but the password is wrong
  • I tried to close() the server socket and re-initialize it as soon as the ESP32's station connects to get a fresh socket, but it doesn't help either.
  • Debugging into the HTTP server process revealed that the select() we use to wait for incoming connections times out every time as soon as the station connects - just as if nobody attempted to connect.
There's a lot of code involved (obviously), but I'll try to provide the parts that are (IMO) most relevant to this problem.

HTTP server socket initialization:

      // Zero the address structure, then fill in only the fields we need.
      struct sockaddr_in server_addr;
      std::memset(&server_addr, 0, sizeof(server_addr));
      server_addr.sin_family = AF_INET;
      server_addr.sin_addr.s_addr = INADDR_ANY;
      server_addr.sin_port = htons(serverPort);

      this->serverSocket = socket(AF_INET, SOCK_STREAM, 0);
      if(this->serverSocket < 0) {
        LOG_ERROR(LOG_TAG, "Failed to create server socket: %d", errno);
        return;
      }

      // Timeval struct for timeout - {secs, usecs}
      struct timeval tv = { 1, 0 };
      if(setsockopt(this->serverSocket, SOL_SOCKET, SO_RCVTIMEO, (const char*) &tv, sizeof(struct timeval)) < 0) {
        LOG_ERROR(LOG_TAG, "Failed to set socket timeout: %d", errno);
      }
      int reuseAddress = 1;
      if(setsockopt(this->serverSocket, SOL_SOCKET, SO_REUSEADDR, &reuseAddress, sizeof(reuseAddress)) < 0) {
        LOG_ERROR(LOG_TAG, "Failed to enable address reuse for socket: %d", errno);
      }

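      if(bind(this->serverSocket, (struct sockaddr*) (&server_addr), sizeof(server_addr)) < 0) {
        LOG_ERROR(LOG_TAG, "Failed to bind socket to port %d: %d", serverPort, errno);
        // Release the descriptor so a later re-initialization does not leak it.
        close(this->serverSocket);
        this->serverSocket = -1;
        return;
      }
      if(listen(this->serverSocket, 5) < 0) {
        LOG_ERROR(LOG_TAG, "Failed to set socket to listen mode: %d", errno);
        close(this->serverSocket);
        this->serverSocket = -1;
        return;
      }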
Code used to accept new HTTP clients:

    struct sockaddr_in client_addr = { 0, 0, 0, 0, 0 };
    socklen_t sin_size = sizeof(client_addr);
    // Create socketSet to be able to use select() instead of accept() to enable timeouts.
    static fd_set socketSet;
    // Initialize the set of active sockets.
    FD_ZERO (&socketSet);
    FD_SET (serverSocket, &socketSet);

    // Set timeout for select {secs, usecs}
    struct timeval timeout = { 1, 0 };

    int clientSocket = -1;

    int rv = select(serverSocket + 1, &socketSet, nullptr, nullptr, &timeout);
    if (rv < 0) {
      // Error on select.
      LOG_ERROR(LOG_TAG, "select failed with %d", errno);
      return;
    } else if (rv == 0) {
      // Timeout occurred. This is where we land after the station connects.
      return;
    } else {
      // Client actually connected -> accept.
      clientSocket = accept(serverSocket, (struct sockaddr*) &client_addr, &sin_size);
      if (clientSocket < 0) {
        LOG_ERROR(LOG_TAG, "accept failed with %d", errno);
        return;
      }
    }
    
    // Do HTTP server stuff with clientSocket ...
To me, it seems like either the socket becomes invalid or it starts to listen() to the new station mode connection...
  • Is there something fundamentally wrong with my approach?
  • Can I bind a socket to a specific interface (ESP_IF_WIFI_AP in this case)?
  • Do you need other sections of our code to find a solution?
Thank you very much for your help!

David

Postby Oromis » Mon Aug 27, 2018 9:17 am

Anyone? Any ideas would be greatly appreciated!

Ritesh
Posts: 1383
Joined: Tue Sep 06, 2016 9:37 am
Location: India
Postby Ritesh » Mon Aug 27, 2018 6:02 pm

Hi,

Could you please provide a few more details? For example, are you using any HTTP components such as nghttp, or any other third-party components?

It would also help if you provided a few more details about the issue.

ESP32 IDF Version?

If it is an older one, did you check with the latest stable release and the master branch?

Did you try the same code with a plain BSD socket or Netconn socket instead of the HTTP parser?

Did you check whether the WiFi connection is being re-established when you run into this issue?

Hope this helps in digging into the root cause.
Regards,
Ritesh Prajapati

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Postby ESP_Angus » Wed Aug 29, 2018 2:52 am

Hi David,

When WiFi connects or disconnects, DHCP gets a new lease, or the ESP32 enables/disables AP mode, it can change the network interface configuration in LWIP, which can have flow-on effects. It should be possible to get the behaviour you want, though.

Ideally your code would wait for the events which indicate that the AP is enabled, and/or that STA has connected to an AP (or received an IP address), before the LWIP server sockets are created/bound. However, I don't think this is strictly necessary when binding to all interfaces (even though you can expect some weird behaviour when interfaces are added/removed).
> Can I bind a socket to a specific interface (ESP_IF_WIFI_AP in this case)?
Yes, you can do this by binding the socket to the AP's local IP address. In the code provided, you are binding to 0.0.0.0 (IPADDR_ANY), which will bind to the local address of all interfaces.
> Debugging into the HTTP server process revealed that the select() we use to wait for incoming connections times out every time as soon as the station connects - just as if nobody attempted to connect.
What happens after this? Do you retry the select() again? If so, is that where it hangs?
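
The suggestion above, binding to the AP's local IP address instead of 0.0.0.0, could look roughly like the sketch below. `createListenerOn` is a hypothetical helper, not code from this thread, and the POSIX headers are an assumption for a desktop build (on ESP-IDF the same calls would presumably come from `lwip/sockets.h`).

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>

// Hypothetical helper: creates a TCP listener bound to one local IPv4
// address instead of INADDR_ANY. Returns the descriptor, or -1 on failure.
int createListenerOn(const char* localIp, uint16_t port) {
  struct sockaddr_in addr;
  std::memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_port = htons(port);
  // Parse the dotted-quad string; anything that is not a valid IPv4
  // address is rejected before a socket is created.
  if (inet_pton(AF_INET, localIp, &addr.sin_addr) != 1) {
    return -1;
  }

  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) {
    return -1;
  }

  // Same address-reuse option as in the original initialization code.
  int reuse = 1;
  setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse));

  if (bind(fd, (struct sockaddr*) &addr, sizeof(addr)) < 0 ||
      listen(fd, 5) < 0) {
    close(fd);  // do not leak the descriptor on failure
    return -1;
  }
  return fd;
}
```

On the ESP32 this might be called as `createListenerOn("192.168.4.1", 80)`, assuming the soft-AP still uses its default address; if the address was changed, it would have to be read from the AP interface's IP configuration first. Whether this avoids the stall described in this thread is untested here.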

Postby Ritesh » Sat Sep 01, 2018 5:25 pm

Hi,

Is the issue still there, or has it been resolved?

If the issue is still present, could you please answer the questions from my last post?
Regards,
Ritesh Prajapati

Postby Oromis » Mon Jan 28, 2019 1:16 pm

Hi,

sorry for the super-long delay, I've been busy with more pressing issues, but yes, this still is an issue for us.

To answer the questions:
> Could you please provide a few more details? For example, are you using any HTTP components such as nghttp, or any other third-party components?
We don't use any third-party HTTP components. Our very simple HTTP parser is custom-made, but it doesn't have anything to do with requests not arriving at the server.
> ESP32 IDF Version?
We regularly update to the latest stable release; the issue is still present in 3.1.2.
> Did you try the same code with a plain BSD socket or Netconn socket instead of the HTTP parser?
I don't use HTTP Parser. As I said, I use LWIP sockets, and I haven't tried any other socket implementations.
> Did you check whether the WiFi connection is being re-established when you run into this issue?
I don't really understand the question. In some cases, several requests arrive at the server in quick succession after some time, so they seem to queue up without being recognized by the select() call. This might be client-side retrying, though. Browsers tend to do that.
> What happens after this? Do you retry the select() again? If so, is that where it hangs?
Yes, I retry after a select() times out. But the retried select() calls also time out (until about one or two minutes later, when the behavior described above happens and the queued-up requests come in).
> Ideally your code would wait for the events which indicate that the AP is enabled, and/or that STA has connected to an AP (or received an IP address), before the LWIP server sockets are created/bound. However, I don't think this is strictly necessary when binding to all interfaces (even though you can expect some weird behaviour when interfaces are added/removed).
I'm not sure whether I understand that correctly. I can try delaying the socket initialization until the AP initialization event has been emitted, but the initial initialization is already working correctly. It is only when the ESP32 attempts to connect to an AP in station mode that its own AP stops accepting requests.

Postby Ritesh » Mon Jan 28, 2019 6:24 pm

Hi,

So what is your plan B to overcome this issue? Or are you still waiting for a fix to land in IDF?
Regards,
Ritesh Prajapati

Postby Oromis » Tue Jan 29, 2019 7:58 am

My plan B is to abandon the AP mode and HTTP server altogether and use Bluetooth in combination with a custom Android / iOS app to configure WiFi credentials. This solution is a lot more work but it also avoids (legitimate) security warnings from browsers when the user tries to enter their WiFi credentials on the unencrypted (HTTP) site served by the ESP32.

I think this is either a bug in IDF or at the very least requires some documentation. I guess that the use case is not uncommon (every WiFi-based consumer IoT product needs a way to set up WiFi credentials), so some support from the platform wouldn't be a wasted effort.

Postby Ritesh » Tue Jan 29, 2019 8:30 am

Hi,

I just want to clarify: do you face this issue every time you commission WiFi credentials, i.e. does the LWIP listening socket break on every attempt to connect to a WiFi access point?

Also, just FYI: messages may be dropped when you communicate in AP and STA mode at the same time, because the WiFi radio is shared between the STA and AP interfaces.
Regards,
Ritesh Prajapati

Postby Oromis » Tue Jan 29, 2019 12:16 pm

> I just want to clarify: do you face this issue every time you commission WiFi credentials, i.e. does the LWIP listening socket break on every attempt to connect to a WiFi access point?
No, it doesn't happen every single time, but in about 90% of cases. So it very rarely works as intended.
> Also, just FYI: messages may be dropped when you communicate in AP and STA mode at the same time, because the WiFi radio is shared between the STA and AP interfaces.
Sure, I've been thinking about this as well. But is it really possible that establishing a new connection in STA mode completely stalls the AP interface for such a long time?
