[IDFGH-2892] Endless SPI w/ DMA receive problem

NiclasH
Posts: 8
Joined: Mon Dec 28, 2015 4:34 am

Postby NiclasH » Sun Mar 15, 2020 1:01 pm

Hi,
This issue is driving me bonkers.

I am reading an ADC128S102 (A/D converter) in endless mode. To do that, I set up VSPI with a DMA descriptor that points back to itself, and I sync up a CS signal via MCPWM Timer1 (this should not be needed; it is primarily there to get stable oscilloscope readings).

The data that is sent out is very straightforward: the "next channel" is sent in bit 3, bit 4 and bit 5 while MISO is clocking in the "current channel". The out buffer is set up in this manner:
1. Allocate DMA-capable memory for the descriptor.
2. Clear all bits.
3. Use 20 bytes: 16 bytes for the 8 channel addresses, then 4 bytes of "sync" during which the CS (via the timer) is pulled high.
4. No EOF, and the next descriptor is "myself".
5. Set up the channels to be read.
6. Put 255 in the 4 sync bytes so that they are easy to locate on the scope.

Code:

   out = static_cast<lldesc_t *>(heap_caps_malloc(sizeof(lldesc_t), MALLOC_CAP_DMA));

   memset((void *) out, 0, sizeof(lldesc_t));
   out->size = 20;
   out->length = 20;
   out->offset = 0;
   out->sosf = 0;
   out->eof = 0;
   out->owner = 1;
   out->qe.stqe_next = out;
   out->buf = static_cast<uint8_t *>(heap_caps_malloc(20, MALLOC_CAP_DMA));
   out->buf[0] = 1 << 3;
   out->buf[1] = 0;
   out->buf[2] = 2 << 3;
   out->buf[3] = 0;
   out->buf[4] = 3 << 3;
   out->buf[5] = 0;
   out->buf[6] = 4 << 3;
   out->buf[7] = 0;
   out->buf[8] = 5 << 3;
   out->buf[9] = 0;
   out->buf[10] = 6 << 3;
   out->buf[11] = 0;
   out->buf[12] = 7 << 3;
   out->buf[13] = 0;
   out->buf[14] = 0 << 3;
   out->buf[15] = 0;
   out->buf[16] = 255;
   out->buf[17] = 255;
   out->buf[18] = 255;
   out->buf[19] = 255;
This works well, and I do the same in the DAC without issue.
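
The 20 assignments above follow a regular pattern, so the same frame can be produced by a loop. The following is only a sketch: `control_byte` and `build_tx_frame` are hypothetical helper names, not part of the original code, and the buffer is a plain `std::array` rather than DMA-capable memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kChannels = 8;
constexpr std::size_t kSyncBytes = 4;
constexpr std::size_t kFrameBytes = 2 * kChannels + kSyncBytes;  // 20 bytes, as in the post

// Control byte that selects the NEXT channel to convert (address in bits 5..3).
inline std::uint8_t control_byte(unsigned channel) {
    assert(channel < kChannels);
    return static_cast<std::uint8_t>(channel << 3);
}

// Hypothetical helper: produces the same 20 bytes as the hand-written assignments.
inline std::array<std::uint8_t, kFrameBytes> build_tx_frame() {
    std::array<std::uint8_t, kFrameBytes> frame{};  // zero-initialised
    for (std::size_t slot = 0; slot < kChannels; ++slot) {
        // Slot n requests channel n+1; the last slot wraps back to channel 0.
        frame[2 * slot] = control_byte(static_cast<unsigned>((slot + 1) % kChannels));
        // frame[2 * slot + 1] stays 0.
    }
    for (std::size_t i = 2 * kChannels; i < kFrameBytes; ++i) {
        frame[i] = 0xFF;  // "sync" bytes, easy to find on the scope
    }
    return frame;
}
```

On the ESP32 the result would still have to be copied into the `heap_caps_malloc(20, MALLOC_CAP_DMA)` buffer that the descriptor points to.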

The read descriptor is set up in a similar fashion:

Code:

   in = static_cast<lldesc_t *>(heap_caps_malloc(sizeof(lldesc_t), MALLOC_CAP_DMA));
   memset((void *) in, 0, sizeof(lldesc_t));
   in->size = 20;
   in->length = 20;
   in->offset = 0;
   in->sosf = 0;
   in->eof = 0;
   in->owner = 1;
   in->qe.stqe_next = in;
   in->buf = static_cast<uint8_t *>(heap_caps_malloc(20, MALLOC_CAP_DMA));
Then there is a bunch of code to set up the DMA and SPI peripherals, as well as to phase the timer to the exact clock. The SPI runs at 10 MHz and the timer at 20 MHz, so I can adjust the phase of CS relative to SCLK in half-cycle steps.
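
For orientation, the frame timing implied by these figures can be worked out as below. This is a sketch that takes the 10 MHz, 20 MHz and 20-byte values from this post at face value and assumes SCLK runs continuously with no gap between loops of the descriptor; the function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kSpiClockHz = 10000000;    // SCLK, from the post
constexpr std::uint32_t kTimerClockHz = 20000000;  // MCPWM timer clock, from the post
constexpr std::uint32_t kFrameBytes = 20;          // one pass through the descriptor

// Bits clocked per pass through the circular descriptor.
constexpr std::uint32_t frame_bits() { return kFrameBytes * 8; }

// Duration of one pass in nanoseconds (assumes no idle time between bytes).
constexpr std::uint32_t frame_period_ns() {
    return frame_bits() * (1000000000u / kSpiClockHz);
}

// Timer ticks that elapse during one pass.
constexpr std::uint32_t timer_ticks_per_frame() {
    return frame_bits() * (kTimerClockHz / kSpiClockHz);
}

static_assert(frame_bits() == 160, "20 bytes are 160 SCLK cycles");
```

Under these assumptions one pass takes 16 µs (320 timer ticks), so each of the 8 channels would be sampled at 62.5 kHz.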


Now, on the oscilloscope, every single bit looks fine. I have checked all the timing I can think of with the scope. It all looks good externally.

HOWEVER, the buffer that gets filled has 2 major problems:

1. It starts off with the 5th value at the beginning of the in-buffer, then the 6th, 7th and 8th, followed by the 4 bytes read while CS is high; then come the 1st, 2nd, 3rd and finally the 4th value. The 65534 is also a bit mysterious, and could be a clue.
Ex (16-bit values in buffer order): 1, 0, 0, 0, 65535, 65534, 1717, 1720, 858, 1717

2. After some time (minutes), everything in the in-buffer looks garbled, but there is no difference on the scope.
Ex: 768, 256, 0, 3, 65535, 65030, 46598, 47363, 23046, 46336
It definitely looks like "a byte off".

3. Then, some more time later, we are "back", but we are now another byte off:
Ex: 1, 0, 0, 65535, 65534, 1719, 1722, 857, 1717, 2
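
The "a byte off" reading of the second sample can be checked offline by splitting the ten values into 20 bytes, rotating the stream by one byte and pairing the bytes up again. This is a minimal sketch, assuming each printed value holds its first received byte in the high half; `regroup_after_byte_rotation` is a hypothetical helper, not something from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits each word into two bytes (high byte first), rotates the byte stream
// left by `shift` bytes (circularly), and pairs the bytes up again.
std::vector<std::uint16_t> regroup_after_byte_rotation(
        const std::vector<std::uint16_t>& words, std::size_t shift) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve(words.size() * 2);
    for (std::uint16_t w : words) {
        bytes.push_back(static_cast<std::uint8_t>(w >> 8));
        bytes.push_back(static_cast<std::uint8_t>(w & 0xFF));
    }
    std::vector<std::uint16_t> out;
    const std::size_t n = bytes.size();
    for (std::size_t i = 0; i < n; i += 2) {
        const unsigned hi = bytes[(i + shift) % n];
        const unsigned lo = bytes[(i + shift + 1) % n];
        out.push_back(static_cast<std::uint16_t>((hi << 8) | lo));
    }
    return out;
}

// The second sample from the post, regrouped one byte later.
std::vector<std::uint16_t> second_sample_regrouped() {
    return regroup_after_byte_rotation(
        {768, 256, 0, 3, 65535, 65030, 46598, 47363, 23046, 46336}, 1);
}
```

Regrouped this way, the second sample becomes 1, 0, 0, 1023, 65534, 1718, 1721, 858, 1717, 3. The four large readings line up closely with the 1717, 1720, 858, 1717 of the first sample, which is consistent with a one-byte slip. The 1023 (0x03FF) in the sync area remains unexplained by this simple rotation.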


So, my questions to Espressif engineers and anyone with a DEEP understanding of how this works:

1. How can the READ function start (in relation to the WRITE operations) in the "wrong place"?

2. How can the descriptor sequence "lose" a byte, and continue to do so every so often (seldom, but detrimentally in my application)?


Attached are scope photos: one close-up of the start of the descriptor, one of the "end" where the 255s are written out, and one that shows a complete loop.
Pink/top = MISO
Cyan/2nd = SCLK
Yellow/3rd = MOSI
Blue/bottom = CS

Thanks in advance
Niclas
Attachments
IMG_20200315_205353.jpg
IMG_20200315_205411.jpg
IMG_20200315_205450.jpg

ESP_michael
Posts: 37
Joined: Mon Aug 28, 2017 10:25 am

Re: Endless SPI w/ DMA receive problem

Postby ESP_michael » Mon Mar 23, 2020 2:19 am

Hi NiclasH,

Could you explain how each signal is controlled? For example:

CS: by SPI_MASTER or PWM
CLK: by SPI_MASTER or PWM

And do you use SPI_MASTER or SPI_SLAVE?

As far as I know, our SPI_MASTER doesn't provide any feature to "sync" the CS with a signal. It's always triggered by software. SPI_SLAVE can respond to the CS signal, but there are quite a few timing and DMA control issues to solve.

ESP_Alvin
Posts: 211
Joined: Thu May 17, 2018 2:26 am

Re: [IDFGH-2892] Endless SPI w/ DMA receive problem

Postby ESP_Alvin » Mon Mar 23, 2020 3:32 am

Moderator's note: update the topic title for issue tracking. Thanks.

Re: [IDFGH-2892] Endless SPI w/ DMA receive problem

Postby NiclasH » Wed Apr 22, 2020 10:44 am

Thanks for taking an interest. Since my post I have given up, but for a different reason... More about that below. First, to answer your questions:

SCLK is generated by the SPI module.

CS is generated by MCPWM. The MCPWM resolution is 4 clk cycles per SCLK cycle, so, together with the "sync" feature in MCPWM, I can set the CS pulse exactly right. The code below is the critical section that sets this up. The "505" is the number of MCPWM pulses needed for the first count to get in sync (it is written on the last line inside the "for" block).

I run the loop more than once (as written below it makes three passes), to ensure that a code cache miss doesn't occur on the final pass and that the timing while interrupts are disabled is exact.

Problem 1: As described above, after some number of seconds/minutes (I don't remember exactly) there is an 8-bit shift in the DMA or MCPWM. Since it is exactly 8 bits, I doubt that it is the MCPWM, so I assume it is a DMA thing. I overcame that in software by looking inside the DMA buffer for the 0xFF bytes shifted in while CS is high, accepted that as a solution, and moved on.
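
A workaround of this kind could look roughly like the sketch below. It is not the code actually used here; the helper names are hypothetical, and it rests on assumptions taken from the samples in the first post: the sync slot reads back as three 0xFF bytes followed by 0xFE or 0xFF, and every real sample is a 12-bit value whose high byte is at most 0x0F.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kChannels = 8;
constexpr std::size_t kSyncBytes = 4;
constexpr std::size_t kFrameBytes = 2 * kChannels + kSyncBytes;  // 20
constexpr std::size_t kNotFound = kFrameBytes;                   // sentinel value

using Frame = std::array<std::uint8_t, kFrameBytes>;

// Hypothetical helper for experiments: lays out ten 16-bit values high byte first.
Frame frame_from_words(const std::array<std::uint16_t, kFrameBytes / 2>& words) {
    Frame f{};
    for (std::size_t i = 0; i < words.size(); ++i) {
        f[2 * i] = static_cast<std::uint8_t>(words[i] >> 8);
        f[2 * i + 1] = static_cast<std::uint8_t>(words[i] & 0xFF);
    }
    return f;
}

// Returns the byte offset where the 4-byte sync slot starts, or kNotFound.
std::size_t find_sync_start(const Frame& f) {
    for (std::size_t p = 0; p < kFrameBytes; ++p) {
        auto at = [&f, p](std::size_t k) { return f[(p + k) % kFrameBytes]; };
        // Three 0xFF bytes, then a fourth byte that cannot be a sample high byte.
        const bool marker =
            at(0) == 0xFF && at(1) == 0xFF && at(2) == 0xFF && at(3) > 0x0F;
        // The byte right after the slot must look like a 12-bit sample's high byte.
        const bool data_follows = at(kSyncBytes) <= 0x0F;
        if (marker && data_follows) return p;
    }
    return kNotFound;
}

// Reads the 8 samples that follow the sync slot, wrapping around the buffer.
std::array<std::uint16_t, kChannels> extract_samples(const Frame& f,
                                                     std::size_t sync_start) {
    assert(sync_start < kFrameBytes);
    std::array<std::uint16_t, kChannels> samples{};
    for (std::size_t slot = 0; slot < kChannels; ++slot) {
        const std::size_t i = sync_start + kSyncBytes + 2 * slot;
        const unsigned hi = f[i % kFrameBytes];
        const unsigned lo = f[(i + 1) % kFrameBytes];
        samples[slot] = static_cast<std::uint16_t>((hi << 8) | lo);
    }
    return samples;
}
```

Fed with the first sample from the original post (1, 0, 0, 0, 65535, 65534, 1717, 1720, 858, 1717), this finds the sync slot at byte 8 and returns 1717, 1720, 858, 1717, 1, 0, 0, 0. For the third sample the slot is found at byte 6. A frame where the marker itself is damaged, like the second sample, returns `kNotFound` and would have to be skipped.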

Problem 2: When ANY OTHER CODE was modified, such as the Arduino sketch on top (executing on the other core), it could happen that the timing/sync value (the "505") had to be changed A LOT. It could jump several hundred clocks. The larger the code change, the higher the probability of going out of sync, but even a ridiculously small change (one line) could force me to search for a new sync value.

I have no clue exactly what happens or why, but I suspect that I can't make the ESP32 core execution timing reliable, even for a few dozen lines of code. If you have an answer to this, then I am all ears.

Code:

   // Values to be written during time critical stage
   auto s1 = (1 << MCPWM_TIMER1_MOD_S) | (2 << MCPWM_TIMER1_START_S);
   auto s3 = (505 << MCPWM_TIMER1_PHASE_S) | (0 << MCPWM_TIMER1_SYNCO_SEL) | (1 << MCPWM_TIMER1_SYNC_SW_S);

   portDISABLE_INTERRUPTS();  // No interference in timing.

   esp_err_t error = aaa_spi_prepare_circular(VSPI_HOST, 2, out, in, 10000000, mosi_pin, miso_pin, sclk_pin, 0);
   ESP_ERROR_CHECK(error);

   spi_dev_t *const spiHw = aaa_spi_get_hw_for_host(VSPI_HOST);

   // this bit of code makes sure both timers and SPI transfer are started as close together as possible
   for (int i = 0; i < 5; i+=2) {  // Make sure SPI Flash fetches doesn't interfere


      // Stop and reset timer
      WRITE_PERI_REG(MCPWM_TIMER1_CFG1_REG(0), (2 << MCPWM_TIMER1_MOD_S) | (0 << MCPWM_TIMER1_START_S)); // decrement mode, stop timer 1 on TEZ

      //Reset DMA
      spiHw->dma_out_link.start = 0;
      spiHw->dma_in_link.start = 0;
      spiHw->dma_out_link.stop = 1;
      spiHw->dma_in_link.stop = 1;
      spiHw->cmd.usr = 0;   // SPI: Stop SPI DMA transfer
      spiHw->cmd.val = 0;

      // Set up timer...
      initialize(cs_pin);

      // --- sync to known prescaled cycle. 
      uint32_t waitUntil = READ_PERI_REG(MCPWM_TIMER1_STATUS_REG(0));
      while (waitUntil != READ_PERI_REG(MCPWM_TIMER1_STATUS_REG(0)));


      spiHw->dma_conf.val |= SPI_OUT_RST | SPI_IN_RST | SPI_AHBM_RST | SPI_AHBM_FIFO_RST;
      spiHw->dma_conf.val &= ~(SPI_OUT_RST | SPI_IN_RST | SPI_AHBM_RST | SPI_AHBM_FIFO_RST);
//      spiHw->dma_conf.out_data_burst_en = 0;

      spiHw->dma_out_link.start = 1;
      spiHw->dma_in_link.start = 1;
      spiHw->dma_out_link.restart = 1;   // Start SPI DMA transfer for MOSI
      spiHw->dma_in_link.restart = 1;   // Start SPI DMA transfer for MISO
      WRITE_PERI_REG(MCPWM_TIMER1_CFG1_REG(0), s1); // start timer 1
      spiHw->cmd.usr = 1; // start SPI transfer
      WRITE_PERI_REG(MCPWM_TIMER1_SYNC_REG(0), s3);
   }

   portENABLE_INTERRUPTS();

Re: [IDFGH-2892] Endless SPI w/ DMA receive problem

Postby NiclasH » Sun Apr 26, 2020 5:02 am

Second problem found: it was an ESP_LOGI inside the portDISABLE_INTERRUPTS block. A newer esp-idf detected that and pointed me in the right direction. As for the original problem, I can live with my workaround.

Re: [IDFGH-2892] Endless SPI w/ DMA receive problem

Postby NiclasH » Sat May 02, 2020 6:05 am

Nope. The compiling/linking still affects the delays that happen somewhere inside the silicon... We have decided to abandon the hardware solution at hand and go with something else, where we don't depend on reliable CPU execution cycles.
