CAN controller
The CAN controller for the ESP32 works great.
However, when I send a couple of messages with multiple nodes on the CAN bus, it gives me a warning and eventually goes BUS OFF.
When I communicate with only one other node, nothing goes wrong.
When I end up in BUS OFF, I restart the CAN controller and it reports no more errors, but I still can't communicate with my other node.
The settings stayed the same.
Is there another way to restart the CAN controller?
Re: CAN controller
Hi,
I'm facing similar problems with bus-off.
The only workaround I have found so far is to read the TX error counter and, if it exceeds half of the maximum value, start a CPU-based bus-off recovery before the controller actually enters real bus-off.
Attached is my modified CAN.c. Try it and let me know if it works for you.
Attachments: CAN.c (8.62 KiB)
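The attached CAN.c is not reproduced in this thread, so the following is only a minimal sketch of the threshold check described in the post, not the actual modification. It assumes an 8-bit transmit error counter, so "half of the maximum value" is taken as UINT8_MAX / 2; the function name and the limit macro are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the TX error counter is read as an 8-bit value, so
 * "half of max" is UINT8_MAX / 2 = 127. The limit used in the attached
 * CAN.c may differ. */
#define TX_ERR_EARLY_RECOVERY_LIMIT (UINT8_MAX / 2)

/* Hypothetical helper: returns true when the application should start its
 * own recovery instead of waiting for the controller to reach bus-off. */
static bool can_needs_early_recovery(uint8_t tx_err_count)
{
    return tx_err_count > TX_ERR_EARLY_RECOVERY_LIMIT;
}
```

A polling task could call this with the current counter value and trigger a controller restart when it returns true. Note that with this limit a node that is alone on the bus (counter parked at 128) would also trigger a recovery.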
Re: CAN controller
Interesting. Ignore tx_error_count below, as it is unused. This is the output of a task that prints these values whenever TXERR.B (the transmit error counter) changes. It missed some samples because I had to put a vTaskDelay(1) at the end of its loop to keep the watchdog fed, but I see SR.ES change to 1 when TXERR.B reaches 96 and revert to 0 when it drops below 96.
This was in response to sending a CAN frame without another device on the bus, then connecting the other device and sending further frames.
I'm trying to work out which scenarios would produce a bus-off, since it seems that the transmission is retried. I'm also thinking about how to handle an always-on device that may sometimes have nothing else on the CAN bus, or a sudden power-down of another device on the bus.
TXERR.B 64, tx_error_count 0, SR.BS 0, SR.ES0
TXERR.B 96, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 128, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 126, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 124, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 122, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 120, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 119, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 118, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 116, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 114, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 112, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 110, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 108, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 106, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 105, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 102, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 100, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 98, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 96, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 94, tx_error_count 0, SR.BS 0, SR.ES0
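The thresholds visible in this trace can be written down as a small classifier. This is a sketch assuming the standard CAN fault confinement limits (warning at 96, error passive above 127, bus-off above 255) and the default SJA1000 error warning limit of 96; the enum and function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical names; thresholds follow the CAN fault confinement rules. */
typedef enum {
    CAN_ERR_ACTIVE,   /* both counters below the warning limit   */
    CAN_ERR_WARNING,  /* a counter reached 96 (SR.ES would be 1) */
    CAN_ERR_PASSIVE,  /* a counter exceeded 127                  */
    CAN_BUS_OFF       /* transmit error counter exceeded 255     */
} can_err_state_t;

static can_err_state_t can_classify(uint16_t tx_err, uint16_t rx_err)
{
    if (tx_err > 255u) {
        return CAN_BUS_OFF;
    }
    if (tx_err > 127u || rx_err > 127u) {
        return CAN_ERR_PASSIVE;
    }
    if (tx_err >= 96u || rx_err >= 96u) {
        return CAN_ERR_WARNING;
    }
    return CAN_ERR_ACTIVE;
}
```

Applied to the trace, 94 is error active, 96 to 126 is the warning range, and 128 is error passive. The counter never gets near the bus-off limit in this scenario.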
Re: CAN controller
Hi Timons,
First of all, the bus-off condition should not be reached under normal circumstances. It might be caused by a hardware issue, such as invalid bus termination, or you might have multiple nodes trying to arbitrate the bus with the same message ID at the same time.
I'd recommend looking closely at the SJA1000 data sheet. It has a built-in error recovery procedure that meets the CAN specification. It would never recover if there were lots of dominant bits on the bus...
To be able to recover/retry under any circumstances (open bus, no termination, faulty nodes, broken cables, ...), I've used the following procedure without issues so far:
1. wait a long time (a second)
2. enter reset mode manually (MODULE_CAN->MOD.B.RM = 1;)
3. uninstall the ISR (esp_intr_free(CAN_cfg.intr_handle);)
4. wait a long time again (another second)
5. initialise again (CAN_init();)
Best,
Markus
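The five steps above could be wrapped in one function. The sketch below is host-compilable and therefore takes the hardware actions as callbacks; on the ESP32 those callbacks would contain the statements quoted in the list (for example vTaskDelay() for the waits). The struct, the function names and the demo_* stand-ins are all hypothetical and only record the order of the steps.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hook table; on target these would wrap the real calls. */
typedef struct {
    void (*delay_ms)(uint32_t ms);   /* e.g. a vTaskDelay() wrapper             */
    void (*enter_reset_mode)(void);  /* MODULE_CAN->MOD.B.RM = 1;               */
    void (*free_isr)(void);          /* esp_intr_free(CAN_cfg.intr_handle);     */
    int  (*init)(void);              /* CAN_init(); assumed to return 0 on success */
} can_restart_ops_t;

/* Runs the restart sequence in the order given in the post. */
static int can_full_restart(const can_restart_ops_t *ops, uint32_t wait_ms)
{
    if (ops == NULL) {
        return -1;
    }
    ops->delay_ms(wait_ms);      /* 1. wait a long time       */
    ops->enter_reset_mode();     /* 2. enter reset mode       */
    ops->free_isr();             /* 3. uninstall the ISR      */
    ops->delay_ms(wait_ms);      /* 4. wait a long time again */
    return ops->init();          /* 5. initialise again       */
}

/* Host-side stand-ins that only log which step ran. */
static char demo_log[64];
static void demo_note(const char *s)
{
    strncat(demo_log, s, sizeof demo_log - strlen(demo_log) - 1);
}
static void demo_delay(uint32_t ms) { (void)ms; demo_note("wait;"); }
static void demo_reset(void)        { demo_note("reset;"); }
static void demo_free_isr(void)     { demo_note("free_isr;"); }
static int  demo_init(void)         { demo_note("init;"); return 0; }

static const can_restart_ops_t demo_ops = {
    demo_delay, demo_reset, demo_free_isr, demo_init
};
```

Calling can_full_restart(&demo_ops, 1000) leaves the step order in demo_log, which makes it easy to check the sequence without hardware.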
Re: CAN controller
Markus, I'm wondering from my trace whether retransmission stops when the node goes error passive at TXERR.B 128, since the counter does not go higher. When I reconnect the second node, the message is then received, so it does not reach bus-off. I could not see from the SJA1000 manual what happens with retransmission (except in single-shot mode). So far I am thinking of not recovering from bus-off and only reporting an error.
Re: CAN controller
Kvaser explains it:
Q: What happens if a node is alone on the bus and tries to transmit?
A: The node will, of course, win the arbitration and happily proceeds with the message transmission. But when the time comes for acknowledging… no node will send a dominant bit during the ACK slot, so the transmitter will sense an ACK error, send an error flag, increase its transmit error counter by 8 and start a retransmission. This will happen 16 times; then the transmitter will go error passive. By a special rule in the error confinement algorithm, the transmit error counter is not further increased if the node is error passive and the error is an ACK error. So the node will continue to transmit forever, at least until someone acknowledges the message.
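The rule in this answer can be checked with a few lines of arithmetic. The sketch below models only the lone-node case from the quote (every attempt ends in an ACK error, starting from a counter of 0); the function name is made up.

```c
#include <stdint.h>

#define CAN_ERROR_PASSIVE_LIMIT 128u

/* Transmit error counter of a node that is alone on the bus after the
 * given number of unacknowledged transmission attempts. */
static uint16_t lone_node_tx_err_after(unsigned attempts)
{
    uint16_t tx_err = 0;
    for (unsigned i = 0; i < attempts; ++i) {
        if (tx_err < CAN_ERROR_PASSIVE_LIMIT) {
            tx_err += 8;  /* ACK error while error active: +8 */
        }
        /* Error passive: ACK errors no longer increase the counter. */
    }
    return tx_err;
}
```

After 16 attempts the counter reaches 128 and stays there however long the node keeps retrying, which matches the trace posted earlier in the thread (64, 96, 128, then no further increase).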
Re: CAN controller
Hi jcsbanks,
Exactly. In the simple case (node disconnected), the controller enters the error passive state and keeps transmitting the frame without incrementing the TX error counter until the frame is acked by a node. In other cases, on a massively disturbed bus, bus-off will be reached.
Anyway, a frame sitting in the controller for a long time will likely become useless from the application's point of view. Also, the node(s) that finally ack the frame might not include the node that should receive it (it could still be rebooting).
Many networks become perfectly usable again after some time, for example once the power supply stabilizes after an overload.
This is why I think an always-on device should recover even from bus-off after some (longer) time. The CAN specification requires (and I think the SJA1000 ensures) a pause of 128 * 11 bits, but I would recommend a much longer time. Then, if the ratio of successful communication to restart count gets too low, a good CAN node could stop CAN and report "CAN ready"
Markus
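To put numbers on this, here is a small sketch of the two ideas in the post: the minimum pause of 128 * 11 bit times converted to microseconds, and a "give up" check based on the ratio of successful frames to restarts. Both function names and the min_ok_per_restart parameter are assumptions for illustration; the post does not specify a concrete ratio.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimum bus-off recovery pause (128 * 11 bit times) in microseconds,
 * rounded up. Returns 0 for an invalid bitrate of 0. */
static uint32_t can_min_busoff_pause_us(uint32_t bitrate_bps)
{
    if (bitrate_bps == 0u) {
        return 0u;
    }
    const uint64_t bits = 128u * 11u;  /* 1408 bit times */
    return (uint32_t)((bits * 1000000u + bitrate_bps - 1u) / bitrate_bps);
}

/* Hypothetical policy: stop restarting once the number of successfully
 * exchanged frames per restart drops below min_ok_per_restart. */
static bool can_should_give_up(uint32_t ok_frames, uint32_t restarts,
                               uint32_t min_ok_per_restart)
{
    if (restarts == 0u) {
        return false;  /* never restarted, nothing to judge yet */
    }
    return (ok_frames / restarts) < min_ok_per_restart;
}
```

At 500 kbit/s the minimum pause works out to 2816 microseconds, so the one-second waits suggested earlier in the thread are far above what the specification requires.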
Re: CAN controller
Good advice, thanks Markus.
Re: CAN controller
After I restart the ESP32 (esp_restart()), the CAN controller interrupt (CAN_isr()) no longer works. I'm using the CAN.c that Thomas Barth made, similar to the one spintec attached in this thread. CAN works fine after a power-on start or a reset from the external button, just not after a commanded restart.
Have any of you had similar problems? What did you do to fix them?
I recently updated from IDF 2.1 to IDF 3.0. With IDF 2.1, the CAN ISR worked after esp_restart(), so I assume something changed in 3.0 that affects the CAN shutdown/restart/initialization.
Re: CAN controller
I got it working now. I'm not exactly sure what the problem was, but here is what I did...
I went back and re-ran the basic CAN demo, adding a periodic esp_restart(). I saw that it worked fine with IDF 3.0, so I figured something in my higher-level code wasn't re-initializing correctly.
I added some guards in my main app (not the basic demo) so that
- CAN_write_frame wouldn't be called after a CAN_stop was issued (before esp_restart)
- CAN_write_frame wouldn't be called until after a CAN_init was issued (after esp_restart)
Then I also added a call to CAN_stop before calling esp_restart (this wasn't needed to get the basic CAN demo working with periodic esp_restart, but I figured it would make the restart more graceful).
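The guards described here might look roughly like the sketch below. The poster's actual code is not shown, so this is an assumption about its shape: a single ready flag that is set after init and cleared before stop. The demo_driver_* functions and demo_frame_t are stand-ins for CAN_init, CAN_stop, CAN_write_frame and the driver's frame type, so that the example compiles on a host.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in frame type and driver calls (hypothetical, host-side only). */
typedef struct {
    uint32_t id;
    uint8_t  dlc;
    uint8_t  data[8];
} demo_frame_t;

static unsigned demo_frames_sent;
static int  demo_driver_init(void) { return 0; }   /* stands for CAN_init() */
static void demo_driver_stop(void) { }             /* stands for CAN_stop() */
static int  demo_driver_write(const demo_frame_t *f)  /* CAN_write_frame() */
{
    (void)f;
    ++demo_frames_sent;
    return 0;
}

/* True only between a successful init and the next stop. */
static atomic_bool can_ready = false;

static int guarded_can_init(void)
{
    int rc = demo_driver_init();
    if (rc == 0) {
        atomic_store(&can_ready, true);
    }
    return rc;
}

static void guarded_can_stop(void)
{
    atomic_store(&can_ready, false);  /* block writers first */
    demo_driver_stop();
}

/* Returns -1 without touching the driver when CAN is not ready. */
static int guarded_can_write(const demo_frame_t *f)
{
    if (!atomic_load(&can_ready)) {
        return -1;
    }
    return demo_driver_write(f);
}
```

With this shape, the application would call guarded_can_stop() right before esp_restart(), and any task that still tries to send in between simply gets an error back instead of touching a stopped controller.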