SPI master MOSI starts later than SCK when network activity

szabiaura
Posts: 15
Joined: Wed Jul 08, 2020 5:27 am

SPI master MOSI starts later than SCK when network activity

Postby szabiaura » Wed Jan 11, 2023 1:01 am

In an application I want to send big SPI transactions of 12288 bytes in a continuous way, repeating every 40 ms. The SPI data rate is set to 2.5 MHz so there should be ~680 us between each transaction without overrun. Works like a charm.

But as soon as I start sending data (1 UDP packet of 4096 bytes every 8 ms, fragmented on the IP level) to the ESP32 over UDP to a random port over Ethernet, even if there are no sockets listening in the firmware, some SPI transactions start acting weird. I want to transmit the bit pattern 100 in a repeated way. Here's what a normal transaction looks like. Signal above is MOSI, below is SCK.
normal.png
Normal transaction
normal.png (14.08 KiB) Viewed 4276 times
But a few times every minute at irregular intervals this is what I get:
error.png
Transaction with error
error.png (12.46 KiB) Viewed 4276 times
It looks like MOSI is delayed for some reason. The buffer containing the data that is transmitted over SPI is not modified in the meantime.

Any ideas why this is happening? If I stop sending the data over Ethernet, all transactions become normal again without exception.

I'm using the ESP32-Ethernet-Kit. Also attached my code to initialize the SPI master and run the transactions with a timer.
Attachments
spi_transmitter.c
Code that initializes the SPI master peripheral and runs the transactions with a timer.
(3.13 KiB) Downloaded 344 times

yake827
Posts: 43
Joined: Mon Aug 09, 2021 7:51 am

Re: SPI master MOSI starts later than SCK when network activity

Postby yake827 » Fri Jan 13, 2023 7:06 am

I'm sorry I didn't understand your question.
Could you also output the CS signal? I want to know whether the MOSI starts late specifically because SPI has started transmission or just because SPI starts late

szabiaura
Posts: 15
Joined: Wed Jul 08, 2020 5:27 am

Re: SPI master MOSI starts later than SCK when network activity

Postby szabiaura » Fri Jan 13, 2023 3:03 pm

Hi, thank you for your answer.

Here's a capture with the CS signal included:
error2.png
error2.png (15.66 KiB) Viewed 4154 times
There are 2 more strange things I discovered.

I changed the pattern to [0b10010010, 0b01001001, 0b00100110] (so it's the bit pattern 100 repeated over 3 bytes with the exception of the last repetition which is 110). The buffer is prefilled this this pattern at initialization and not changed in any way later. This way the MOSI signal should be 7 short pulses followed by 1 long pulse. Now, if you look further into the transaction, this is what you see:
error3.png
error3.png (16.96 KiB) Viewed 4154 times
Looking closer at the +40 us line:
error4.png
error4.png (12.2 KiB) Viewed 4154 times
In the beginning the signal looks OK but then you see 1 short pulse followed by a long pulse, and then again 7 short pulses and a long pulse again. This 1 short and one long pulse is not what it should be according to the original pattern. Also note the irregular period of MOSI: the period of the last long pulse before the error is 1.188 us which is normal, but the one after (the one before the single short pulse) is 2 us. The forum won't let me upload more screenshots so I put it here: https://www.imgpaste.net/image/K6IJ9U

If you look at the next transaction, it's normal again which confirms that the pattern has not changed in memory.

Another thing I noticed is that this happens even if the data is discarded immediately after it's received from the network before LwIP sees it. To reproduce this, I pushed the minimal code example to https://github.com/szekelyisz/esp32-spibug. It's basically the basic ethernet example with the SPI code added. I also included a JavaScript file to send data. Make sure you change the IP address to the one allocated to the ESP.

To reproduce this, follow these steps:
  1. Set a breakpoint at this line: https://github.com/espressif/esp-idf/bl ... eth.c#L103
  2. Start debugging with breakpoints disabled.
  3. Let the ESP acquire an IP address.
  4. Start the JavaScript. The SPI problems described above will start showing up.
  5. Enable breakpoints. The breakpoint set in step 1 should be hit immediately.
  6. Using you debugger, set the value of `eth_driver->stack_input` to NULL. This causes all data to be properly discarded before LwIP receives it.
  7. Let the firmware run. The problems with the SPI signal happen regardless the data being discarded.
This tells me that the problem is on or very close to the hardware level. Maybe a concurrent memory access problem on the chip itself?

szabiaura
Posts: 15
Joined: Wed Jul 08, 2020 5:27 am

Re: SPI master MOSI starts later than SCK when network activity

Postby szabiaura » Fri Jan 13, 2023 3:45 pm

Here's a bit of help to catch the problem. The bit pattern is crafted so that when MOSI is wired to an addressable LED strip (WS2812B and friends) it should cause all LEDs to emit a dim light (red, green and blue all set to 1 which is the minimum brightness). When the error happens, all LEDs blink one time. This is because the bit stream that they synchronize to in the beginning is suddenly shifted after a few bytes so all values become something else (2^n where n is random according to how many bits the SPI peripheral skipped in the beginning), but then the next transaction is very likely to be normal again, so they all go back to 1.

Now you can catch the blink with a photodiode and use its signal as trigger for a logic analyzer. The wrong transaction is the one before the photodiode triggers.
error6.png
error6.png (47.29 KiB) Viewed 4148 times

szabiaura
Posts: 15
Joined: Wed Jul 08, 2020 5:27 am

Re: SPI master MOSI starts later than SCK when network activity

Postby szabiaura » Fri Jan 13, 2023 3:51 pm

One more strange thing can be seen at the end of the SPI transaction showing the problem. According to the bit pattern in memory MOSI should end in 7 short pulses followed by 1 long pulse. But this is what happens:
error7.png
error7.png (11 KiB) Viewed 4143 times

yake827
Posts: 43
Joined: Mon Aug 09, 2021 7:51 am

Re: SPI master MOSI starts later than SCK when network activity

Postby yake827 » Mon Jan 16, 2023 3:43 am

Hi, I have seen the code you uploaded and I hope you can modify the code for the following verification:
1. Take spi_device_queue_trans out of the timer interrupt function and put it in a normal task, synchronized through queue
2. After spi_device_queue_trans, call spi_device_get_trans_result to ensure that the last SPI transfer is complete before the new SPI transmit start.
If the problem persists, please send the code and the phenomenon.

szabiaura
Posts: 15
Joined: Wed Jul 08, 2020 5:27 am

Re: SPI master MOSI starts later than SCK when network activity

Postby szabiaura » Wed Jan 18, 2023 2:46 am

Hi yake827, thank you for your help. I implemented your suggestions, but the problem still persists with the exact same symptoms.

See https://github.com/szekelyisz/esp32-spibug for updated code.

ESP_ondrej
Posts: 211
Joined: Fri May 07, 2021 10:35 am

Re: SPI master MOSI starts later than SCK when network activity

Postby ESP_ondrej » Wed Jan 18, 2023 8:37 am

SPI and EMAC shares the DMA. Could you please try to decrease `dma_burst_len` in your EMAC config (`eth_esp32_emac_config_t esp32_emac_config`) to e.g. `ETH_DMA_BURST_LEN_2`?

szabiaura
Posts: 15
Joined: Wed Jul 08, 2020 5:27 am

Re: SPI master MOSI starts later than SCK when network activity

Postby szabiaura » Wed Jan 25, 2023 12:09 am

ESP_ondrej wrote:
Wed Jan 18, 2023 8:37 am
SPI and EMAC shares the DMA. Could you please try to decrease `dma_burst_len` in your EMAC config (`eth_esp32_emac_config_t esp32_emac_config`) to e.g. `ETH_DMA_BURST_LEN_2`?
Thank you guys, that fixed the issue.

Who is online

Users browsing this forum: jcolebaker and 67 guests