Let's talk wifi and TCP

Postby fly135 » Tue May 01, 2018 5:30 pm

I work as a consultant from home on the east coast for a company on the west coast. When we do an OTA update, mine takes about a minute while theirs takes about 5 minutes, assuming it doesn't error out during the download. Their wifi is slammed with congestion because they're in an office building; mine, at home, is not. So my task is figuring out how to handle congested wifi while working in an uncongested environment.

The device I'm working on has, like the WROVER, a 4MB SPIRAM chip. I'm thinking that wifi/TCP could use more buffers to handle retransmissions and all the network magic that goes on to make TCP reliable. There are two unchecked options in the LWIP section of menuconfig...

[ ] Enable fragment outgoing IP packets
[ ] Enable reassembly incoming fragmented IP packets

I would have thought that fragmented IP packets were something that was going to happen, and I'm wondering why these options aren't checked by default. Am I wrong about this?
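One possible reason these options matter less for TCP (an assumption, not something confirmed in this thread): TCP sizes its segments to fit the link MTU, so IP fragmentation mostly comes up with large UDP datagrams. A minimal sketch of the arithmetic, assuming a 20-byte IPv4 header with no options; `ip_fragment_count` is a hypothetical helper, not an lwIP function:

```c
#include <assert.h>

/* Assumption: IPv4 header without options. */
#define IPV4_HEADER_BYTES 20u

/* Hypothetical helper: number of IPv4 fragments needed to carry
 * `ip_payload_bytes` (everything after the IP header) over a link with the
 * given MTU. Every fragment except the last must carry a multiple of 8 bytes. */
unsigned ip_fragment_count(unsigned ip_payload_bytes, unsigned mtu)
{
    assert(mtu >= IPV4_HEADER_BYTES + 8u);

    unsigned max_payload = mtu - IPV4_HEADER_BYTES;
    if (ip_payload_bytes <= max_payload) {
        return 1; /* fits in a single packet, no fragmentation */
    }

    unsigned per_fragment = (max_payload / 8u) * 8u;
    return (ip_payload_bytes + per_fragment - 1u) / per_fragment;
}
```

With a 1500-byte MTU, a full TCP segment (20-byte TCP header plus 1460 bytes of data) gives `ip_fragment_count(1480, 1500) == 1`, while a 4000-byte UDP datagram (8-byte UDP header plus data) gives `ip_fragment_count(4008, 1500) == 3`.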

In the wifi section of menuconfig there is this....

(10) Max number of WiFi static RX buffers
(32) Max number of WiFi dynamic RX buffers
Type of WiFi TX buffers (Static) --->
(16) Max number of WiFi static TX buffers

If I remove the limit on the number of dynamic RX buffers by setting it to zero, will these buffers be allocated from SPIRAM? Same question for adding more static RX and TX buffers. I can't afford to lose 8-bit internal RAM, so any extra buffers need to come from external RAM.

Looking for sage advice! :)

John A

Postby fly135 » Tue May 01, 2018 6:05 pm

More on this...

I set (16) Max number of WiFi static TX buffers to 32, and I was failing to allocate some task mem on startup.

After setting that back to 16 my app started up fine.
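A rough way to see why doubling the static TX buffers could starve task allocation is to add up what the buffers cost. This sketch assumes about 1.6 KB per static buffer, which is the approximate figure given in the ESP-IDF Kconfig help; treat it as an assumption for your IDF version. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: roughly 1600 bytes per static Wi-Fi buffer. */
#define WIFI_STATIC_BUF_BYTES 1600u

/* Hypothetical helper: total RAM reserved for static RX and TX buffers. */
size_t wifi_static_buf_bytes(unsigned rx_count, unsigned tx_count)
{
    return (size_t)(rx_count + tx_count) * WIFI_STATIC_BUF_BYTES;
}

/* Hypothetical helper: extra RAM needed when raising the static TX count. */
size_t wifi_static_tx_delta_bytes(unsigned old_tx, unsigned new_tx)
{
    assert(new_tx >= old_tx);
    return (size_t)(new_tx - old_tx) * WIFI_STATIC_BUF_BYTES;
}
```

Under that assumption the default 10 RX + 16 TX static buffers reserve about 41600 bytes, and going from 16 to 32 TX buffers asks for roughly another 25600 bytes, which would be consistent with task allocations failing at startup.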

I also set

(32) Max number of WiFi dynamic RX buffers -> 0 to allow unlimited allocation.

In addition I set both IP fragment configs to enabled....

[*] Enable fragment outgoing IP packets
[*] Enable reassembly incoming fragmented IP packets

The interesting result of this is that it appears to allocate more of the internal 32 bit mem but less of the 8 bit internal mem, which is a good thing if I'm reading the heap report correctly.

Now to see if it improves congested wifi performance.

***Edit***: my comment above about using more 32 bit and less 8 bit mem was wrong. I didn't let the app settle first. Enabling IP fragmentation and allowing unlimited dynamic RX buffers appear to have no effect on the available heap after startup. Not sure about under heavy wifi load.

John A

Postby hassan789 » Fri May 04, 2018 1:09 am

Try the "iperf" example in both locations and see what throughput you get. The default configs in ESP-IDF v3.1 have LWIP running from RAM, as well as the tick rate at 1kHz, along with ethernet frame optimizations. You won't get anything better than that.

Postby fly135 » Thu May 10, 2018 5:34 pm

I just saw your post. I'll take a look at using that as a test program. I'm less interested in speed and more interested in reliability. OTA downloads are about 1.5MB and my sound file uploads are about 1/4MB. In a congested situation these xfers will stall out and eventually the server will disconnect (usually 5 mins after an xfer stalls).

Has anyone played with "esp_wifi_set_max_tx_power"? The default already allows max power, though. The config structure also has tx_ba_win and rx_ba_win, which are the block ACK window sizes. I may be beating a dead horse, but wifi performance in a busy office setting appears to be pretty bad.

John A

Postby PatrikB » Tue May 21, 2019 7:31 pm

Hi,

Having the same issue here. Did you manage to solve it?

BR
Patrik

Postby ESP_Angus » Wed May 22, 2019 1:06 am

fly135 wrote (Thu May 10, 2018 5:34 pm):
I may be beating a dead horse, but wifi performance in a busy office setting appears to be pretty bad.

Hi John,

Just to check all the variables: how is 2.4GHz WiFi performance for other devices in the same environment? I.e., if you connect a PC to the 2.4GHz network (same router, same frequency band) and download the same OTA file, how does performance compare?

If I had to guess, TCP fast retransmits are failing so packets are falling back to slow TCP retransmits. This means whole seconds when no data is being sent. You may be able to tune this on the server side as well.
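To put numbers on "whole seconds": fast retransmit needs three duplicate ACKs to trigger, and when those never arrive the sender waits for the retransmission timeout (RTO), which doubles on each consecutive timeout. A minimal sketch of the textbook RFC 6298 backoff, not lwIP's or any server's exact timer; the function name, initial RTO and cap are assumptions for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: total idle time (ms) spent waiting across
 * `timeouts` consecutive retransmission timeouts, with the RTO doubling
 * after each one and capped at `max_rto_ms`. */
uint64_t tcp_rto_stall_ms(uint32_t initial_rto_ms, unsigned timeouts,
                          uint32_t max_rto_ms)
{
    assert(max_rto_ms > 0);

    uint32_t rto = initial_rto_ms < max_rto_ms ? initial_rto_ms : max_rto_ms;
    uint64_t total = 0;

    for (unsigned i = 0; i < timeouts; i++) {
        total += rto;
        /* Double the timer, clamping at the cap without overflowing. */
        rto = (rto > max_rto_ms / 2) ? max_rto_ms : rto * 2;
    }
    return total;
}
```

With an assumed 1 s initial RTO and a 60 s cap, four consecutive timeouts add up to `tcp_rto_stall_ms(1000, 4, 60000) == 15000`, i.e. 15 s with no data moving.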

OTA is a particularly nasty environment for the TCP connection because the ESP32 can't service the TCP connections during flash erase/write, so timeouts will happen more frequently than for non-OTA TCP throughput.

If you can get packet captures then this can help a lot. The ESP32's TCP connection is probably the most useful, although this can be difficult without changing the "native" environment if you don't have a router that can set up port mirroring or something similar. Maybe more viable to packet capture from the OTA server side, to see what it sees? Although this is only part of the picture.

Postby fly135 » Wed May 22, 2019 4:09 pm

Yes, it is OTAs where the problem is most apparent. The company I work for has a bunch of APs that show up in the office, and they report a lot of random OTA failures there. Unfortunately (or fortunately for me) I work from home on the opposite coast, and I never experience OTA failures. I have everything I need to do port mirroring here, but I also have good wifi connectivity. The timeout on the OTA server was 5 min. The OTA would get a random percentage done, then the ESP32 receive would hang until it timed out. When you guys introduced the ESP_HTTPS_OTA functions I switched to them, but lost granularity in the error reporting. However, reliability did seem to improve with that.

We messed around with testing different antennas, but eventually decided the WROVER module with the built-in antenna was as reliable as, or more reliable than, anything else. I put a retry mechanism in place for OTAs with an exponential backoff (i.e. 5 min, 25 min, 125 min) and nobody seems to be complaining now.
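The backoff schedule above (5, 25, 125 minutes) can be sketched as follows. This is an illustration, not the actual firmware: the function names, the callback types, and the limit of three retries are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define OTA_RETRY_BASE_MIN 5u /* first retry delay, in minutes */
#define OTA_RETRY_FACTOR   5u /* each delay is 5x the previous one */
#define OTA_RETRY_MAX      3u /* assumed: give up after three retries */

/* Hypothetical helper: delay in minutes before retry `retry` (0-based),
 * or 0 once the retries are exhausted. */
uint32_t ota_retry_delay_min(unsigned retry)
{
    if (retry >= OTA_RETRY_MAX) {
        return 0;
    }
    uint32_t delay = OTA_RETRY_BASE_MIN;
    for (unsigned i = 0; i < retry; i++) {
        delay *= OTA_RETRY_FACTOR;
    }
    return delay;
}

/* Hypothetical callbacks: one OTA attempt, and a blocking wait. */
typedef bool (*ota_attempt_fn)(void);
typedef void (*ota_wait_fn)(uint32_t minutes);

/* Try once, then retry with growing delays until success or exhaustion. */
bool ota_run_with_retries(ota_attempt_fn attempt_ota, ota_wait_fn wait_minutes)
{
    assert(attempt_ota != NULL && wait_minutes != NULL);

    if (attempt_ota()) {
        return true;
    }
    for (unsigned retry = 0; retry < OTA_RETRY_MAX; retry++) {
        wait_minutes(ota_retry_delay_min(retry));
        if (attempt_ota()) {
            return true;
        }
    }
    return false;
}
```

Spacing the retries out this way gives a congested network time to clear instead of hammering it with back-to-back attempts.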

They were comparing wifi reliability with a previous TI based design, which did perform better in their tests using Smokeping.

John A
