
Callback does not return when it uses an object that is being used on the other core

Posted: Tue Feb 13, 2024 11:21 am
by lukilukeskywalker
Not sure if this belongs in this forum (ESP-IDF); I just want to understand what is happening internally.
I have an object whose class provides two methods. One is FPGA_handler(), which takes a few variables, does some computations, packs the results into a bitstream and sends it over an SPI channel. This method is called constantly and runs on CPU 1.
The other method handles communication over serial and CAN.
For a few months I have been refactoring the code, as the main developer left. Most of it didn't work, and CAN was not in working condition either. (I mention this only to say that I don't know whether his implementation would have worked at all.)
Last week I refactored the CAN communication to use esp_event, so that different callbacks run depending on the ID on the CAN bus. Some callbacks receive an object pointer in void *args, and inside the callback, methods of the object are called to configure some of its variables (the same variables used inside FPGA_handler()). These variables are not protected by a mutex, semaphore or anything similar (so yes, I guess I should start by protecting access to them).
What is interesting is that when the callback is called, the functions execute without any problem, but when the callback has to return, it never does. In fact, the task stops working completely. I am not sure whether the core gets halted.
Why does this happen? Why does the callback never return? What happens when two cores try to access the same data?

Re: Callback does not return when it uses an object that is being used on the other core

Posted: Tue Feb 13, 2024 5:38 pm
by MicroController
Why does this happen? Why does the callback never return?
It may be memory corruption caused by a race condition. Or something else entirely.
What happens when two cores try to access the same data?
The hardware doesn't care, it will happily synchronize individual RAM accesses across cores without you even noticing.
However, what the CPU considers a single RAM access may be different from what looks like a single access in your code. Plus, RAM accesses may not happen in the assembler code exactly when you expect them to happen from looking at your program, or even not at all.
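
To illustrate the point about a "single access": a statement like value += 1 is typically a load, an add and a store. The sketch below is not real multi-core code; it replays one possible interleaving of those steps by hand, in a single thread, so the outcome is deterministic. SharedRam, Core and both functions are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical shared variable, standing in for a field that both
// FPGA_handler() and a CAN callback touch.
struct SharedRam {
    uint32_t value = 0;
};

// One simulated core executing `value += delta` as three separate steps.
struct Core {
    uint32_t reg = 0;  // core-local register
    void load(const SharedRam& ram) { reg = ram.value; }
    void add(uint32_t delta) { reg += delta; }
    void store(SharedRam& ram) const { ram.value = reg; }
};

// Both cores load before either one stores: one increment is lost.
uint32_t interleaved_increments() {
    SharedRam ram;
    assert(ram.value == 0);
    Core core0, core1;
    core0.load(ram);   // core0 reads 0
    core1.load(ram);   // core1 also reads 0
    core0.add(1);
    core1.add(1);
    core0.store(ram);  // writes 1
    core1.store(ram);  // writes 1 again, overwriting core0's update
    return ram.value;
}

// Each core finishes its load/add/store before the other one starts.
uint32_t sequential_increments() {
    SharedRam ram;
    Core core0, core1;
    core0.load(ram);
    core0.add(1);
    core0.store(ram);
    core1.load(ram);   // core1 now sees core0's result
    core1.add(1);
    core1.store(ram);
    return ram.value;
}
```

Every individual load and store here is perfectly "synchronized", yet interleaved_increments() ends up with 1 instead of 2. On real hardware the interleaving depends on timing, which is why such bugs tend to show up only occasionally.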

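At the very least, you must make sure that shared variables are declared volatile, so that the compiler does not keep them in registers or optimize accesses away. This can do the trick in some limited cases, but volatile alone does not make an update that takes several steps atomic.
Protecting shared data via a mutex or a critical section is the safe/correct way.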
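
A minimal sketch of the mutex approach, assuming the standard C++ std::mutex and std::thread are available in your toolchain. FpgaConfig, Settings, gain and offset are hypothetical names, not taken from the original code. In a FreeRTOS-based project the same pattern could be written with a FreeRTOS mutex (xSemaphoreCreateMutex, xSemaphoreTake, xSemaphoreGive) instead.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the object shared between FPGA_handler()
// and the CAN callbacks.
class FpgaConfig {
public:
    struct Settings {
        uint32_t gain;
        uint32_t offset;
    };

    // Called from the callback context: both fields change under one lock.
    void set(uint32_t gain, uint32_t offset) {
        std::lock_guard<std::mutex> lock(mutex_);
        settings_.gain = gain;
        settings_.offset = offset;
    }

    // Called from the handler context: returns a consistent copy.
    Settings snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return settings_;
    }

private:
    mutable std::mutex mutex_;
    Settings settings_{0, 0};
};

// The writer always sets gain == offset; the reader checks that it never
// observes a half-updated pair.
bool run_consistency_check(uint32_t iterations) {
    FpgaConfig config;
    bool consistent = true;  // written only by the reader thread
    std::thread writer([&] {
        for (uint32_t i = 1; i <= iterations; ++i) config.set(i, i);
    });
    std::thread reader([&] {
        for (uint32_t i = 0; i < iterations; ++i) {
            FpgaConfig::Settings s = config.snapshot();
            if (s.gain != s.offset) consistent = false;
        }
    });
    writer.join();
    reader.join();
    assert(config.snapshot().gain == iterations);  // last write is visible
    return consistent;
}
```

The handler works on the copy returned by snapshot(), so the lock is held only for the short copy and not for the whole computation and SPI transfer.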
Alternatively, you can remove the concurrent accesses to shared data, have one context/task 'own' the data, and send messages to have data updated by and in-sync with the owning task's processing.
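
A rough sketch of that "single owner" idea, again in standard C++ and with hypothetical names (ConfigUpdate, UpdateQueue, run_owner_demo). The callbacks only post messages; the owning thread is the only one that ever touches the variable. With FreeRTOS, a queue (xQueueSend / xQueueReceive) could play the role of UpdateQueue.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical message sent by a callback to the owning task.
struct ConfigUpdate {
    uint32_t gain;
    bool stop;  // tells the owner loop to exit (demo only)
};

// Small thread-safe queue; the only object both threads share.
class UpdateQueue {
public:
    void send(ConfigUpdate msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(msg);
        }
        ready_.notify_one();
    }

    // Blocks until a message is available.
    ConfigUpdate receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        assert(!queue_.empty());
        ConfigUpdate msg = queue_.front();
        queue_.pop();
        return msg;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<ConfigUpdate> queue_;
};

// The owner thread holds `gain` privately and applies updates in order.
uint32_t run_owner_demo() {
    UpdateQueue queue;
    uint32_t final_gain = 0;  // written by the owner, read after join()
    std::thread owner([&] {
        uint32_t gain = 0;  // owned data: no other thread can reach it
        for (;;) {
            ConfigUpdate msg = queue.receive();
            if (msg.stop) break;
            gain = msg.gain;
        }
        final_gain = gain;
    });
    // What the callbacks would do instead of calling setters directly.
    queue.send({5, false});
    queue.send({7, false});
    queue.send({0, true});
    owner.join();
    return final_gain;
}
```

In a handler that runs constantly, the owner would more likely drain the queue without blocking at the start of each iteration rather than wait on it as this demo does.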
