Duet sometimes really slow? - I2C error or?
-
That's interesting.
I tried to reproduce it yesterday, using separate ground wires from the PSU to the Duet and DueX, three Nema 23 motors running at 2A moving continuously driven from the Duet, and a diode connected between Fan0- and E2 endstop stop pin, set to 10Hz frequency, so that the E2 endstop input toggles at 10Hz in order to force the Duet to read the status via I2C frequently. The first time I tried this, I got an I2C lockup after 9 minutes. But I was not able to reproduce it again.
I have reviewed the I2C code and made some changes. In particular, when an I2C error is detected, I now reset the I2C controller on the Duet and retry. Up to 3 tries are done. Each I2C reset is recorded and the reset count is included in the M122 log along with the other I2C stats. Also I have made a change that I planned a long time ago, which is to use a separate RTOS task to monitor the DueX5 state and do the I2C transactions when it changes. This should substantially reduce the latency of the endstop inputs on the DueX.
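For readers following along, the reset-and-retry behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the firmware code: the names `I2CError`, `I2CStats`, `transact_with_retry` and `reset_controller` are hypothetical, and whether the firmware also resets the controller after the final failed try is an assumption (here it does not).

```python
class I2CError(Exception):
    """Raised by a transaction that fails on the bus (hypothetical name)."""


class I2CStats:
    """Counters similar in spirit to the I2C stats reported by M122."""

    def __init__(self):
        self.errors = 0   # failed transaction attempts
        self.resets = 0   # controller resets performed


def transact_with_retry(transaction, reset_controller, stats, max_tries=3):
    """Run transaction(); on I2CError, reset the controller and try again.

    At most max_tries attempts are made. If every attempt fails, the last
    error is re-raised so the caller can see that communication broke down.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return transaction()
        except I2CError:
            stats.errors += 1
            if attempt == max_tries:
                raise                 # out of tries: report the failure
            reset_controller()        # assumed: reset only before a retry
            stats.resets += 1


# Usage: a read that fails twice and then succeeds on the third try.
outcomes = iter([I2CError(), I2CError(), 0x5A])

def flaky_read():
    result = next(outcomes)
    if isinstance(result, Exception):
        raise result
    return result

stats = I2CStats()
value = transact_with_retry(flaky_read, lambda: None, stats)
print(value, stats.errors, stats.resets)
```

The point of counting resets separately from errors is that a nonzero reset count shows the recovery path was exercised even when no fault was visible to the user.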
I intend to release this in firmware 2.03RC2 later today. Ian, would you be able to upgrade to this release and see if you are able to reproduce the problem on it?
-
@dc42 said in Duet sometimes really slow? - I2C error or?:
I intend to release this in firmware 2.03RC2 later today. Ian, would you be able to upgrade to this release and see if you are able to reproduce the problem on it?
Yes of course.
Interestingly, I repeated the exact same sequence for the 4th day running and got exactly the same result. That is to say, pauses between moves that occurred in exactly the same part of the second home all sequence. As before, an M122 showed I2C errors, which were cleared with a subsequent M999. I have no idea why I can't reproduce this unless the printer has been powered down overnight. I've switched the printer off and will try again later today to see if a 1 hour or 2 hour power down will provoke it.
That's 4 days running that I've had exactly the same thing happen, so at least it looks like we are getting close to having a method that will provoke the problem with some degree of confidence, even if we don't know how or what part of the sequence is responsible.
This kind of reminds me of a problem I had in a previous life with a customer who had a V12 E type Jaguar. This thing would break down, but only when he had driven from Luton to the Severn Bridge on the Welsh border and stopped to pay the toll. I got there in the end but it was a bitch to diagnose.
-
@dc42 David, can you pm or email me when you have the release ready so that I don't miss it.
-
@deckingman said in Duet sometimes really slow? - I2C error or?:
I have no idea why I can't reproduce this unless the printer has been powered down overnight. I've switched the printer off and will try again later today to see if a 1 hour or 2 hour power down will provoke it.
When it happens to me, 4 times out of 5 it is at power on, and already on the first homing move it is pausing between movements. The other 1 time in 5 it happens mid print/mid sequence.
-
I've just released 2.03RC2 at https://github.com/dc42/RepRapFirmware/releases/tag/2.03RC2.
-
OK. Just tried it and had no problem but...........
-
The printer had only been off for about 3 to 4 hours rather than overnight. I'll test again tomorrow morning.
-
More importantly, I was using M574 to re-map end stops for the upper load balancing gantry. I ran the exact same home all macro, but it crashed the upper gantry and obviously the behaviour is different now that re-mapping end stops has been withdrawn. So I can't completely replicate the sequence of events that had proven to provoke the problem repeatedly over 4 days.
To run a print like this, I'd need to revert to a configuration that doesn't use the upper gantry or switch to RRF 3.0. Please advise.
Cheers
-
@deckingman, thanks for trying it. RRF3 doesn't yet incorporate these changes, and won't until sometime later this week or next week. So for now, reverting your configuration is the only option.
-
@dc42 OK. In the interests of keeping everything as consistent as possible, I'll just slip the belts off the upper motors. Then I won't be introducing other variables, as might be the case if I change the configuration. We still have the situation where the sequence of events that provoked the problem won't be quite the same, so it introduces an element of doubt as to the effectiveness of the solution.
For info, I ran M122 and noticed this which is new:
Tasks: NETWORK(ready,660) HEAT(blocked,1236) DUEX(suspended,156) MAIN(running,4264) IDLE(ready,160)
Is there anything else you'd like me to check?
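For anyone who wants to check for the new DUEX task from a script rather than by eye, the Tasks line quoted above is easy to pick apart. The following is a minimal Python sketch, assuming the line always has the `NAME(state,number)` shape seen in this one sample; the function name `parse_tasks_line` is made up for illustration, and the number is simply kept as an integer without interpreting it.

```python
import re

# One task entry looks like NAME(state,number), e.g. DUEX(suspended,156).
TASK_PATTERN = re.compile(r"(\w+)\((\w+),(\d+)\)")


def parse_tasks_line(line):
    """Return {task_name: (state, number)} for an M122 'Tasks:' line."""
    prefix, _, rest = line.partition(":")
    if prefix.strip() != "Tasks":
        raise ValueError("not a Tasks line")
    return {
        name: (state, int(number))
        for name, state, number in TASK_PATTERN.findall(rest)
    }


# The line reported in this thread.
line = ("Tasks: NETWORK(ready,660) HEAT(blocked,1236) "
        "DUEX(suspended,156) MAIN(running,4264) IDLE(ready,160)")
tasks = parse_tasks_line(line)
print("DUEX" in tasks)   # whether the new task is present
print(tasks["DUEX"])     # its state and number
```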
-
There is a new field "resets" in the I2C stats line.
-
@dc42 Yup. That was reported as zero in this instance. I'll try it (the sequence) again tomorrow morning.
-
I will update to this release tonight, and will report back. The behavior of the homing sequence is identical to my issue.
-
I ran "the sequence" again today with this new RC firmware with no I2C issues but......
-
The sequence isn't quite the same because I can't home my 3rd gantry (the homeall file is the same but the end stop mapping is disabled in this firmware). I've just taken the belts off the upper gantry motor pulleys.
-
During the first homing sequence I had a report - "Error: over temperature shutdown reported by driver(s) 8." I've never seen this before and I don't think it's "real". Driver 8 is the 3rd extruder drive so that motor hadn't even been energised.
For info, running M122 after "the sequence" shows no I2C errors and resets as being zero.
HTH and unless I hear otherwise, I'll run it again tomorrow without making any changes.
-
Quick update. I had to print something in a hurry. The print went well, with no sign of any over temperature errors, so hopefully that was a one-off, unexplainable glitch. I wasn't expecting to have any I2C errors and that was what happened. M122 shows no I2C resets either. The printer is now going into its overnight hibernation ready for tomorrow's home, extrude, home sequence.
-
@deckingman, thanks for the update.
To be clear, what I would like to establish is:
- Whether the original fault is fixed (with or without a nonzero i2c reset count). The original fault is that i2c communication breaks down. The symptom of the printer going slow was a side effect, which I don't expect to be present in the new release even if i2c comms does break down. To test whether i2c communication is working, try changing the speed of a fan connected to the DueX.
- Whether or not the new code that fetches the states of the DueX endstop inputs is reliable. So after a long print, please use M119 or the Machine Properties page of the old DWC to read the endstop states as you operate the switches.
-
@dc42 I didn't get a chance to run my tests last night, will try tonight. I can generally get it to start going slowly during the calibration process of leveling out the nozzles. That involves a lot of repetitive moves and tool changes, positioning each tool at various points on the bed and checking for z equivalency. It's manual and fairly time consuming, so I don't know of a way to automate reproducing the i2c issue I was hitting.
-
@dc42 said in Duet sometimes really slow? - I2C error or?:
@deckingman, thanks for the update.
To be clear, what I would like to establish is:
- Whether the original fault is fixed (with or without a nonzero i2c reset count). The original fault is that i2c communication breaks down. The symptom of the printer going slow was a side effect, which I don't expect to be present in the new release even if i2c comms does break down. To test whether i2c communication is working, try changing the speed of a fan connected to the DueX.
- Whether or not the new code that fetches the states of the DueX endstop inputs is reliable. So after a long print, please use M119 or the Machine Properties page of the old DWC to read the endstop states as you operate the switches.
Errr, that might be a tad difficult.
The fans that are connected to the Duex are all thermostatic and used to cool the steppers. However I can get round that because I just changed the mounts to aluminium ones so I won't need to cool the motors. I deliberately left all the fans, thermistors and wiring as was in the interest of consistency so it'll be an easy thing to either play around with the thermostatic thresholds or disable thermostatic mode and manually change fan speeds.
End stops might be a bit more challenging. The first 2 (E2 and E3) on the Duex5 are connected to the upper gantry. These only get mapped to X and Y when homing so don't normally show up, and with this firmware I can't map them to X and Y either because M574 is non-functional. The second two (E4 and E5) are used with axes maxima and trigger emergency stop using M574 (which then of course requires a firmware reset).
I don't mind temporarily changing my config to test but please advise what and how.
Cheers
-
@dc42 I'm getting somewhere -- prior to this release it took a lot of work to get it to break. Now I try to home axis W and it causes the board to reset -- no error -- so I'm going to replace that wire. I see it does toggle properly when I trigger the end stop, but it will not home. Axis A homes, and that's the next one on the i2c bus, so it must be interference or something in this specific wire run. Will update after replacing it.
EDIT: replaced the wire -- now homes. Will do the calibration tomorrow night. This looks like interference from the stepper wiring: I had this end stop wire in the same nylon sleeve as the stepper wires. I've had this problem with other wires before, and sleeving the stepper wires by themselves, apart from the end stop wire, solved it. Will post an update tomorrow on how calibration goes -- I didn't catch any i2c errors during this troubleshooting, but something did trigger the Duet to reboot.
-
I have not been able to reproduce the issue with 2.01 since the last time I posted (without touching the hardware), it seems I was just really lucky to get it (or unlucky based on your point of view...). I used to get it a lot and very easily, not so much now.
Since then I tried the 2.03RC2 release, and haven't seen an I2C reset yet. Still keeping an eye on it. I don't have any endstops on the Duex but all my fans are on it due to the 12V regulator.
@deckingman I believe if the Duex communication fails, the fans on the Duex will simply never turn off, even when below the temperature threshold (this is what happened for me, the hotend fan was stuck on even at room temp). So you should be able to confirm this without changing your config.
-
@fulg said in Duet sometimes really slow? - I2C error or?:
.......................@deckingman I believe if the Duex communication fails, the fans on the Duex will simply never turn off, even when below the temperature threshold (this is what happened for me, the hotend fan was stuck on even at room temp). So you should be able to confirm this without changing your config.
Yes that's fine but what I can't test is David's second point which is quote:
"Whether or not the new code that fetches the states of the DueX endstop inputs is reliable."
Because, the implemented changes are based on a version of firmware that does not allow me to use those end stops, (at least in the way that my machine is configured) because support to map end stops to axes has been withdrawn.
@DC42 David. Is there any way that we could have a version of firmware to test which has both the changes you have made to the I2C code, and which also reinstates the M574 mapping of end stops to axes? I am concerned that essentially two things have changed. Firstly, you have made changes to the I2C related code, but secondly, I am no longer able to use the exact same test sequence which had proven to provoke the issue on 4 consecutive occasions. So as my test methodology is no longer the same (because I cannot use the Duex end stops in the same way), it reduces the level of confidence one might have that the changes to the code are an effective solution to this somewhat elusive problem.
-
Ian, you can test the status of the endstops on the DueX by creating additional axes and making them visible. If you create 3 axes then the third one will use the E2 endstop input on the DueX. When that axis is visible, the state of the E2 endstop will show up in the Machine Properties page of the old DWC, and in the M119 response.