Disappointing WiFi performance ESP32-S3

eriksl
Posts: 126
Joined: Thu Dec 14, 2023 3:23 pm
Location: Netherlands

Re: Disappointing WiFi performance ESP32-S3

Postby eriksl » Fri Jun 21, 2024 4:17 pm

MicroController wrote (Fri Jun 21, 2024 1:58 pm):
    eriksl wrote (Fri Jun 21, 2024 1:49 pm):
        Interesting, so I am getting only 1/200th of what I should be getting (~200 kb/s, 4096 byte packets)...
    Not quite. The documented numbers are in mega-bits/second...

Then still there is a factor of 31 unexplained...
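
To compare a measured rate with figures quoted in megabits per second, the units have to be aligned first. A minimal sketch of that conversion, assuming "~200 kb/s" above means roughly 200 000 bytes per second (the function name is made up for illustration):

```c
/* Convert a rate in bytes per second to megabits per second
 * (1 megabit = 1 000 000 bits, as is usual for network throughput). */
static double bytes_per_second_to_mbit(double bytes_per_second)
{
	return(bytes_per_second * 8.0 / 1000000.0);
}
```

Under that assumption, 200 000 bytes per second comes out at about 1.6 Mbit/s.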

I'm satisfied that I have it working for now. Later, when all functionality is added and working (and that's a lot), I'll dive deeper into this. It's a bit tricky though, because I cannot measure the time spent between reception of the WiFi frame and delivery of the TCP/UDP packet to my application through send/recv.

Does the iperf example use the POSIX api (i.e. send/recv)?

Postby eriksl » Thu Jul 11, 2024 3:37 pm

Specifically for this (performance testing / checking) I added some code that really does nothing more than:

* receive test
- accept connection (in case of tcp)
- send block of 4k data
- wait for confirmation
- repeat

* send test
- accept connection (in case of tcp)
- receive block of 4k data
- send confirmation
- repeat
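
Because every 4k block waits for a confirmation before the next one is sent, this exchange is stop-and-wait, so its throughput cannot exceed one block per round trip, whatever the link rate. A minimal sketch of that upper bound; the round-trip time is a hypothetical input that would have to be measured (with ping, for example), and the function name is made up for illustration:

```c
/* Upper bound on throughput, in bytes per second, of a stop-and-wait
 * exchange: at most one block is transferred per round trip. The time
 * the block itself spends on the air is ignored, so the real figure
 * is lower still. */
static double stop_and_wait_limit(double block_bytes, double round_trip_seconds)
{
	return(block_bytes / round_trip_seconds);
}
```

With a hypothetical round trip of 20 ms, 4096-byte blocks are capped at about 204 800 bytes per second. That makes latency (power save settings, for instance) worth ruling out next to raw bandwidth; it is not a confirmed explanation for the numbers in this thread.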

It's nothing more than the standard POSIX stuff like socket, bind, accept, send, sendto, recv, recvfrom, close.

Still the performance is horrible, worse than ESP8266 (which even includes some processing). See the next message for the source.

I have tried really everything: all the tunables I could find, including CPU speed, IRAM settings and buffers in and out of SPIRAM. The difference is marginal; the performance stays very close to what I reported earlier, which is dissatisfying.

The association parameters as reported by my wireless controller are exactly the same as for my ESP8266.

So my question is again, is it the LWIP POSIX interface that slows things down that much? Because on the ESP8266 I am using the native LWIP interface with callbacks.

This is the complete ESP32 code: https://github.com/eriksl/esp32 The performance testing code is in perftest.c.
This is the code for the client used, the same one I use for testing performance on the ESP8266: https://github.com/eriksl/e32if But any client will do, as long as it connects to port 9 (discard, send test) or port 19 (chargen, receive test) and it either sends 4k blocks and waits for the text "ACK" or v.v.
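
For anyone who wants to reproduce this without the e32if client, one round of the client's send test (port 9) can be sketched with plain POSIX calls. This is not taken from e32if; the names send_all and send_block_wait_ack are made up for illustration, and the socket is assumed to be connected already:

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

enum { block_size = 4096 };

/* Send the whole buffer, looping because send() may accept fewer bytes. */
static int send_all(int fd, const char *data, size_t length)
{
	while(length > 0)
	{
		ssize_t sent = send(fd, data, length, 0);

		if(sent <= 0)
			return(-1);

		data += sent;
		length -= (size_t)sent;
	}

	return(0);
}

/* One round of the send test: push one block, then wait for "ACK".
 * Returns 0 when the confirmation arrived, -1 otherwise. */
static int send_block_wait_ack(int fd, const char *block)
{
	char reply[4];
	ssize_t received;

	if(send_all(fd, block, block_size) != 0)
		return(-1);

	/* assumes the short confirmation arrives in a single read */
	received = recv(fd, reply, sizeof(reply), 0);

	if((received < 3) || (memcmp(reply, "ACK", 3) != 0))
		return(-1);

	return(0);
}
```

Calling send_block_wait_ack() in a loop and dividing the number of bytes sent by the elapsed time gives the throughput figure.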

Postby eriksl » Thu Jul 11, 2024 3:38 pm

Code:

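#include <stdint.h>
#include <stdbool.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

#include <freertos/FreeRTOS.h>
#include <freertos/task.h>
#include <esp_heap_caps.h>

#include "perftest.h"
#include "string.h"
#include "cli-command.h"
#include "log.h"
#include "util.h"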

static bool inited = false;

enum
{
	//malloc_type = MALLOC_CAP_INTERNAL
	malloc_type = MALLOC_CAP_SPIRAM
};

static void run_tcp_receive(void *)
{
	enum { size = 4096 };
	char *receive_buffer;
	int accept_fd;
	struct sockaddr_in6 si6_addr;
	socklen_t si6_addr_length;
	int length;
	int tcp_socket_fd;
	static const char ack[] = "ACK"; // array, so sizeof(ack) is 4 ("ACK" plus NUL) and not the size of a pointer
	enum { attempts = 8 };
	unsigned int attempt;

	assert(inited);

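	receive_buffer = heap_caps_malloc(size, malloc_type);

	if(!receive_buffer)
		util_abort("perftest: out of memory tcp receive");

	memset(&si6_addr, 0, sizeof(si6_addr));
	si6_addr.sin6_family = AF_INET6;
	si6_addr.sin6_port = htons(9); // discard

	// keep the calls outside assert(), so they still run when NDEBUG is defined
	if((accept_fd = socket(AF_INET6, SOCK_STREAM, 0)) < 0)
		util_abort("perftest: socket tcp receive");

	if(bind(accept_fd, (const struct sockaddr *)&si6_addr, sizeof(si6_addr)) != 0)
		util_abort("perftest: bind tcp receive");

	if(listen(accept_fd, 0) != 0)
		util_abort("perftest: listen tcp receive");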

	for(;;)
	{
		si6_addr_length = sizeof(si6_addr);

		if((tcp_socket_fd = accept(accept_fd, (struct sockaddr *)&si6_addr, &si6_addr_length)) < 0)
		{
			log_format_errno("perftest: accept fails: %d", tcp_socket_fd);
			continue;
		}

		assert(sizeof(si6_addr) >= si6_addr_length);

		for(;;)
		{
			length = recv(tcp_socket_fd, receive_buffer, size, 0);

			if(length <= 0)
			{
				log_format("perftest tcp recv: %d", length);
				break;
			}

			for(attempt = attempts; attempt > 0; attempt--)
			{
				length = send(tcp_socket_fd, ack, sizeof(ack), 0);

				if(length == sizeof(ack))
					break;

				log_format("perftest tcp send ack: %d, try %d", length, attempt);
				vTaskDelay(100 / portTICK_PERIOD_MS);
			}

			if(attempt == 0)
				log("perftest tcp send ack: no more tries");
		}

		close(tcp_socket_fd);
	}
}

static void run_tcp_send(void *)
{
	enum { size = 4096 };
	char *send_buffer;
	int accept_fd;
	struct sockaddr_in6 si6_addr;
	socklen_t si6_addr_length;
	int length;
	int tcp_socket_fd;
	static const char ack[] = "ACK"; // array, so sizeof(ack) is 4 ("ACK" plus NUL) and not the size of a pointer
	enum { attempts = 8 };
	unsigned int attempt;

	assert(inited);

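	send_buffer = heap_caps_malloc(size, malloc_type);

	if(!send_buffer)
		util_abort("perftest: out of memory tcp send");

	memset(&si6_addr, 0, sizeof(si6_addr));
	si6_addr.sin6_family = AF_INET6;
	si6_addr.sin6_port = htons(19); // chargen

	// keep the calls outside assert(), so they still run when NDEBUG is defined
	if((accept_fd = socket(AF_INET6, SOCK_STREAM, 0)) < 0)
		util_abort("perftest: socket tcp send");

	if(bind(accept_fd, (const struct sockaddr *)&si6_addr, sizeof(si6_addr)) != 0)
		util_abort("perftest: bind tcp send");

	if(listen(accept_fd, 0) != 0)
		util_abort("perftest: listen tcp send");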

	for(;;)
	{
		si6_addr_length = sizeof(si6_addr);

		if((tcp_socket_fd = accept(accept_fd, (struct sockaddr *)&si6_addr, &si6_addr_length)) < 0)
		{
			log_format_errno("perftest: accept fails: %d", tcp_socket_fd);
			continue;
		}

		assert(sizeof(si6_addr) >= si6_addr_length);

		for(;;)
		{
			length = recv(tcp_socket_fd, send_buffer, sizeof(ack), 0);

			if(length <= 0)
			{
				log_format("perftest tcp recv 2: %d", length);
				break;
			}

			for(attempt = attempts; attempt > 0; attempt--)
			{
				length = send(tcp_socket_fd, send_buffer, size, 0);

				if(length == size)
					break;

				if((length < 0) && ((errno == ENOTCONN) || (errno == ECONNRESET)))
					goto abort;

				log_format_errno("perftest tcp send 2: %d, try %d", length, attempt);
				vTaskDelay(100 / portTICK_PERIOD_MS);
			}

			if(attempt == 0)
				log("perftest tcp send 2: no more tries");
		}

abort:
		close(tcp_socket_fd);
	}
}

static void run_udp_receive(void *)
{
	enum { size = 4096 };
	char *receive_buffer;
	struct sockaddr_in6 si6_addr;
	socklen_t si6_addr_length;
	int length;
	int udp_socket_fd;
	static const char ack[] = "ACK"; // array, so sizeof(ack) is 4 ("ACK" plus NUL) and not the size of a pointer
	enum { attempts = 8 };
	unsigned int attempt;

	assert(inited);

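	receive_buffer = heap_caps_malloc(size, malloc_type);

	if(!receive_buffer)
		util_abort("perftest: out of memory udp receive");

	memset(&si6_addr, 0, sizeof(si6_addr));
	si6_addr.sin6_family = AF_INET6;
	si6_addr.sin6_port = htons(9); // discard

	// keep the calls outside assert(), so they still run when NDEBUG is defined
	if((udp_socket_fd = socket(AF_INET6, SOCK_DGRAM, 0)) < 0)
		util_abort("perftest: socket udp receive");

	if(bind(udp_socket_fd, (const struct sockaddr *)&si6_addr, sizeof(si6_addr)) != 0)
		util_abort("perftest: bind udp receive");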

	for(;;)
	{
		si6_addr_length = sizeof(si6_addr);

		length = recvfrom(udp_socket_fd, receive_buffer, size, 0, (struct sockaddr *)&si6_addr, &si6_addr_length);

		assert(sizeof(si6_addr) >= si6_addr_length);

		if(length <= 0)
		{
			log_format("perftest udp recv: %d", length);
			continue;
		}

		for(attempt = attempts; attempt > 0; attempt--)
		{
			length = sendto(udp_socket_fd, ack, sizeof(ack), 0, (const struct sockaddr *)&si6_addr, si6_addr_length);

			if(length == sizeof(ack))
				break;

			log_format("perftest udp send ack: %d, try %d", length, attempt);
			vTaskDelay(100 / portTICK_PERIOD_MS);
		}

		if(attempt == 0)
			log("perftest udp send ack: no more tries");
	}

	close(udp_socket_fd);
}

static void run_udp_send(void *)
{
	enum { size = 4096 };
	char *send_buffer;
	struct sockaddr_in6 si6_addr;
	socklen_t si6_addr_length;
	int length;
	int udp_socket_fd;
	static const char ack[] = "ACK"; // array, so sizeof(ack) is 4 ("ACK" plus NUL) and not the size of a pointer
	enum { attempts = 8 };
	unsigned int attempt;

	assert(inited);

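	send_buffer = heap_caps_malloc(size, malloc_type);

	if(!send_buffer)
		util_abort("perftest: out of memory udp send");

	memset(&si6_addr, 0, sizeof(si6_addr));
	si6_addr.sin6_family = AF_INET6;
	si6_addr.sin6_port = htons(19); // chargen

	// keep the calls outside assert(), so they still run when NDEBUG is defined
	if((udp_socket_fd = socket(AF_INET6, SOCK_DGRAM, 0)) < 0)
		util_abort("perftest: socket udp send");

	if(bind(udp_socket_fd, (const struct sockaddr *)&si6_addr, sizeof(si6_addr)) != 0)
		util_abort("perftest: bind udp send");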

	for(;;)
	{
		si6_addr_length = sizeof(si6_addr);

		length = recvfrom(udp_socket_fd, send_buffer, sizeof(ack), 0, (struct sockaddr *)&si6_addr, &si6_addr_length);

		assert(sizeof(si6_addr) >= si6_addr_length);

		if(length <= 0)
		{
			log_format("perftest udp recv 2: %d", length);
			continue;
		}

		for(attempt = attempts; attempt > 0; attempt--)
		{
			length = sendto(udp_socket_fd, send_buffer, size, 0, (const struct sockaddr *)&si6_addr, si6_addr_length);

			if(length == size)
				break;

			log_format("perftest udp send 2: %d, try %d", length, attempt);
			vTaskDelay(100 / portTICK_PERIOD_MS);
		}

		if(attempt == 0)
			log("perftest udp send 2: no more tries");
	}

	close(udp_socket_fd);
}

void perftest_init(void)
{
	assert(!inited);

	inited = true;

	if(xTaskCreatePinnedToCore(run_tcp_receive, "perf-tcp-recv", 2 * 1024, (void *)0, 1, (TaskHandle_t *)0, 1) != pdPASS)
		util_abort("perftest: xTaskCreatePinnedToCore tcp receive");

	if(xTaskCreatePinnedToCore(run_tcp_send, "perf-tcp-send", 2 * 1024, (void *)0, 1, (TaskHandle_t *)0, 1) != pdPASS)
		util_abort("perftest: xTaskCreatePinnedToCore tcp send");

	if(xTaskCreatePinnedToCore(run_udp_receive, "perf-udp-recv", 2 * 1024, (void *)0, 1, (TaskHandle_t *)0, 1) != pdPASS)
		util_abort("perftest: xTaskCreatePinnedToCore udp receive");

	if(xTaskCreatePinnedToCore(run_udp_send, "perf-udp-send", 2 * 1024, (void *)0, 1, (TaskHandle_t *)0, 1) != pdPASS)
		util_abort("perftest: xTaskCreatePinnedToCore udp send");
}

MicroController
Posts: 1825
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Postby MicroController » Fri Jul 12, 2024 12:52 pm

You can try setting TCP_NODELAY to reduce the latency.
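
For reference, a minimal sketch of setting that option through the POSIX socket API. The helper name enable_nodelay is made up for illustration; setsockopt() with IPPROTO_TCP and TCP_NODELAY is the standard call:

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Disable Nagle's algorithm on a TCP socket.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
static int enable_nodelay(int fd)
{
	int one = 1;

	return(setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)));
}
```

In the test code posted above, the natural place to call it would be on tcp_socket_fd right after accept() returns.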

Postby eriksl » Fri Jul 12, 2024 1:10 pm

That's something I should try indeed; I hadn't thought of it yet.

BUT. It's not going to help for UDP, for which the performance is just as horrible.

Also I am sending 4k blocks at a time. There is not much data to "delay" anyway for Nagle's Algorithm. It's meant to improve performance on really small payloads, a few bytes at most.

Still, has nobody tried something similar? Just a plain setup doing nothing other than sending or receiving, to see what the performance is?

Postby eriksl » Fri Jul 12, 2024 1:33 pm

I created a bug report: https://github.com/espressif/esp-idf/issues/14171

I hope someone at Espressif takes notice.

Postby MicroController » Fri Jul 12, 2024 2:36 pm

eriksl wrote (Fri Jul 12, 2024 1:10 pm):
    Also I am sending 4k blocks at a time. There is not much data to "delay" anyway for Nagle's Algorithm. It's meant to improve performance on really small payloads, a few bytes at most.

When your 4KB chunk of data gets split into (MTU-sized) TCP segments, the last segment is likely not 'full', which may cause lwip to wait a bit for more data to fill up the segment before actually sending out a half-empty one...
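
The split described here is easy to work out. A minimal sketch, assuming a maximum segment size of 1440 bytes; that value is an assumption, and the MSS actually in effect depends on the lwIP configuration and the path MTU. The struct and function names are made up for illustration:

```c
#include <stddef.h>

struct segmentation
{
	size_t full_segments;	/* number of segments filled up to the MSS */
	size_t tail_bytes;		/* size of the final, partially filled segment (0 if none) */
};

/* Work out how a payload is divided into MSS-sized TCP segments. */
static struct segmentation split_into_segments(size_t payload, size_t mss)
{
	struct segmentation result;

	result.full_segments = payload / mss;
	result.tail_bytes = payload % mss;

	return(result);
}
```

For a 4096-byte block and an MSS of 1440 this gives two full segments plus a 1216-byte tail, which is the segment Nagle's algorithm could hold back.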

Postby eriksl » Fri Jul 12, 2024 4:26 pm

That's not the first thing I would think of, especially because
- the ESP8266 doesn't show this behaviour (afaik there is no way to set nodelay in the LWIP native interface)
- UDP suffers from the same.

But the proof of the pudding...

Postby eriksl » Sat Jul 13, 2024 10:17 am

Tried it: enabled TCP_NODELAY on both sides, no difference.

robiwan
Posts: 9
Joined: Sat Dec 07, 2024 11:36 am

Postby robiwan » Sat Dec 07, 2024 3:57 pm

Did you ever get to the root cause? Because I am seeing something similar. Sending UDP packets every 2666us from a RPi, I get nowhere near 2666 us between received packets on the ESP32-S3.
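
One way to put numbers on this is to take a timestamp after every recvfrom() and log the difference to the previous one. A minimal sketch of the interval calculation with POSIX timespec values (the helper name elapsed_us is made up for illustration):

```c
#include <stdint.h>
#include <time.h>

/* Microseconds elapsed between two timestamps, e.g. taken with
 * clock_gettime(CLOCK_MONOTONIC, ...) after consecutive recvfrom() calls. */
static int64_t elapsed_us(const struct timespec *earlier, const struct timespec *later)
{
	int64_t seconds = (int64_t)later->tv_sec - (int64_t)earlier->tv_sec;
	int64_t nanoseconds = (int64_t)later->tv_nsec - (int64_t)earlier->tv_nsec;

	return((seconds * 1000000000LL + nanoseconds) / 1000);
}
```

On the ESP32 side, esp_timer_get_time() already returns microseconds, so there a plain subtraction of two readings should do.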
