Driver Error and MCU overheating
-
Hi,
I hope this is the right subforum for my questions.
There seems to be a problem in my configuration, because I am struggling with MCU overheating and driver errors, such as "Error: Driver 0.0 error: phase A short to Vin", which do not really make sense.
I am using high-quality steppers from IGUS and I get this issue on multiple axes, but mainly on 0.0. I am not operating a standard 3D printer/CNC, but another system. In my case, the 0.0 stepper is just spinning a disc with nothing attached to it, no friction or similar, just inertia. So, there is absolutely no strain on it.
This error mostly occurs together with the MCU heating up, and I am not sure why it heats up. I reach temperatures of 45-50 degrees easily, before it resets or shows an overheating error (EDIT: I just read that the MCU can withstand 85°C? But I get overheating errors at 50 already, is that normal?). I cannot explain why it gets so hot, because the steppers are not even working most of the time. This happens even when it is just sitting there or just spinning the disc around. The currents should never really exceed 1A. Unfortunately, due to an NDA I cannot disclose any details of my setup here in the forum, we'd need to move to email support for that. My suspicion was that somehow the idle currents are very high, but I have tried to set everything properly and reduce the idle current. I operated the Duet from a lab supply for a bit and checked, and the currents do not seem very high at all, around 300mA during idle. Also, the motors are all cold.
Everything worked perfectly previously, and then suddenly this issue occurred. The wiring is correct, the pins are correct, and the internal resistance of both stepper circuits is the same.
Here is an extract from my config.g:
; Smart Drivers
M569 P0 S1 D3 ; driver 0.0 goes backward (X axis)
M569 P1 S1 D3 ; driver 0.1 goes forwards (Y axis)
M569 P2 S1 D3 ; driver 0.2 goes forwards (Z axis)
M569 P3 S1 D3 ; driver 0.3 goes forwards (Z axis)
; Axes
M584 X0.0 Y0.1 Z0.2:0.3 ; set axis mapping
M350 X8 Y1 Z2 I1 ; configure microstepping with interpolation
M906 X1000 Y1200 Z1300 I5 T3 ; set axis driver currents
M917 X5 Y5 Z5 ; standstill current reduction
M92 X80 Y40 Z80 ; configure steps per mm
M208 X0:200 Y0:190 Z0:60 ; set minimum and maximum axis limits
M566 X900 Y900 Z900 ; set maximum instantaneous speed changes (mm/min)
M203 X1000 Y1000 Z2000 ; set maximum speeds (mm/min)
M201 X500 Y75 Z40 ; set accelerations (mm/s^2)
; Kinematics
M669 K0 ; configure Cartesian kinematics
; Probes
M558 K0 P5 C"io4.in" H15 F1500 T600 ; configure unfiltered digital probe via slot #0
G31 K0 P10 X0 Y0 Z0 ; set Z probe trigger value, offset and trigger height
; Endstops
M574 X1 P"!io0.in" S1 ; configure X axis endstop
M574 Y1 P"io1.in" S1 ; configure Y axis endstop
M574 Z2 P"io3.in+io2.in" S1 ; configure Z axis endstop
And here is the M122 report:
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.4.0 (2022-03-15 18:59:15) running on Duet 3 Mini5plus Ethernet (standalone mode)
Board ID: 09WBW-1S8LU-F65J0-409NY-3013Z-HL3RU
Used output buffers: 3 of 40 (27 max)
=== RTOS ===
Static ram: 103684
Dynamic ram: 106056 of which 0 recycled
Never used RAM 29044, free system stack 154 words
Tasks: NETWORK(ready,25.5%,218) ETHERNET(notifyWait,0.2%,568) HEAT(notifyWait,0.0%,358) Move(notifyWait,0.0%,292) CanReceiv(notifyWait,0.0%,942) CanSender(notifyWait,0.0%,356) CanClock(delaying,0.0%,339) TMC(notifyWait,0.6%,105) MAIN(running,72.8%,511) IDLE(ready,0.0%,29) AIN(delaying,0.8%,264), total 100.0%
Owned mutexes:
=== Platform ===
Last reset 00:25:05 ago, cause: software
Last software reset at 2025-02-12 14:07, reason: User, GCodes spinning, available RAM 28840, slot 0
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00461000 BFAR 0xe000ed38 SP 0x00000000 Task MAIN Freestk 0 n/a
Error status: 0x00
MCU revision 3, ADC conversions started 1506042, completed 1506042, timed out 0, errs 0
Step timer max interval 1488
MCU temperature: min 46.8, current 47.2, max 49.9
Supply voltage: min 23.8, current 23.9, max 23.9, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/0, heap memory allocated/used/recyclable 2048/24/24, gc cycles 0
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 2, read errors 0, write errors 0, ifcnt 47, reads 62406, writes 1, timeouts 0, DMA errors 0, CC errors 0
Driver 1: standstill, SG min 0, read errors 0, write errors 0, ifcnt 41, reads 62407, writes 1, timeouts 0, DMA errors 0, CC errors 0
Driver 2: standstill, SG min 2, read errors 0, write errors 0, ifcnt 40, reads 62406, writes 1, timeouts 0, DMA errors 0, CC errors 0
Driver 3: standstill, SG min 12, read errors 0, write errors 0, ifcnt 40, reads 62406, writes 1, timeouts 0, DMA errors 0, CC errors 0
Driver 4: standstill, SG min 0, read errors 0, write errors 0, ifcnt 29, reads 62407, writes 0, timeouts 0, DMA errors 0, CC errors 0
Driver 5: not present
Driver 6: not present
Date/time: 2025-02-12 14:32:10
Cache data hit count 2623384019
Slowest loop: 8.45ms; fastest: 0.13ms
=== Storage ===
Free file entries: 10
SD card 0 detected, interface speed: 22.5MBytes/sec
SD card longest read time 3.9ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 83, segments created 3, maxWait 0ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 9, completed 9, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters -1 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty
=== CAN ===
Messages queued 10672, received 0, lost 0, boc 0
Longest wait 0ms for reply type 0, peak Tx sync delay 0, free buffers 18 (min 18), ts 5929/0/0
Tx timeouts 0,0,5929,0,0,4743 last cancelled message type 30 dest 127
=== Network ===
Slowest loop: 11.63ms; fastest: 0.03ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions
HTTP sessions: 1 of 8
- Ethernet -
State: active
Error counts: 0 0 0 0 0
Socket states: 5 2 2 2 2 2 0 2
As you can see the temperature is very high, and this is even though I had the board idling for about 15 minutes now, not touching anything. I even sent M18 to shut off the motors.
I would be glad for your help, trying to find a solution. I have the same machine built already with the same components, and no issues there.
Cheers
Tobi
-
@TobiAsis what temperature is very high? 45-50 degrees is the standard running temp for the MCU on these boards
what overheat error are you actually seeing?
short to ground is normally an issue with wiring
-
@jay_s_uk Thanks!
This is the MCU temperature. I assumed it would be too high, because I had an error, something like "Overheat error", at around 50C and it shut off. Unfortunately, I am not able to replicate it and I do not remember the precise wording.
However, I feel like at around 50C, it randomly reboots or re-runs the config.g (I can tell, because I have a homeall sequence there, and I hear some sound and then the homing sequence starts. It is all quite mysterious). This is why I thought it went into an overheat error and then rebooted itself.
I am using a Mini 5+ btw, which according to a document in the Duet docs (I cannot post the link because I need "two reputations") might have trouble with temperature readings. I was not able to view the temperature of the motor controllers.
Does a wiring issue mean that the wires themselves are somehow broken? I can be sure that the pins are definitely correct.
-
I just tried to only spin the x-axis again and this happened immediately:
The only thing I can do is emergency stop, so I cannot even access anything behind all these error messages. The MCU temp is now 54C. The Duet was idling for about half an hour before this. The stepper driver temp is reported as either 0 or 100C.
Driver or MCU do not feel very hot, I can touch them with my hands, I would say maybe 30-40C
-
@TobiAsis Are you able to post details of the IGUS stepper motors?
-
@gloomyandy it should be this model https://www.igus.de/product/MOT-AN-S-060-002-042-L-A-AAAO
-
@TobiAsis can you post photos of your wiring at the duet 3 mini side?
-
Perhaps I found something interesting.
I turned the board off and let it cool down a bit. The MCU temperature was very low again. Upon turning on and homing everything, I ran a macro which spins my 0.0 stepper really slowly. To monitor the current, I placed my multimeter on the power in. What surprised me is that I saw a current draw of 1.4A, even with the motor spinning super slowly (F100). After just a few seconds, I got the driver overheat error again.
I then changed the speed to F500, ran it again and the current was just about 250mA. This time, I did not observe the overheating error.
The question would now be: Why does it draw so much current when spinning very slowly? Is it a microstepping thing?
Also, the board is advertised with 1.4A, peak 2A. I am surprised that it already shuts off at 1.4A then.
-
-
@TobiAsis how do those colours on the motors correspond to the colours on the cabling they're supplied with?
-
@jay_s_uk I wired the colors to be exactly the same
-
@TobiAsis Did you study the Mini 5+ Hardware Overview?
In the section Cooling it states:
In many applications passive cooling will be sufficient, especially if the board is mounted vertically in a well ventilated position. If active cooling is needed then a fan blowing across the back of the board along the line of the stepper drivers is recommended. The stepper drivers' heatsinks are connected to the PCB and the majority of the heat is dissipated via the PCB so heatsinks on the stepper driver chips are largely ineffective.
Looking at your photo, I doubt that much heat can be dissipated from the PCB’s back. Better ventilation or active cooling may be required.
To the temperature readings you get: The two subsequent paragraphs in the above mentioned doc (”MCU Temp” and ”Stepper Driver Temp”) explain why RRF reports strange values.
-
@TobiAsis Usually stepper motors are rated at either 'peak' or 'RMS' current, unfortunately Igus have chosen to rate theirs at the 'nominal' current of 1.4A. I'm not sure what this means! I'd tend to assume this is their peak current, and you should run them at a maximum of 80% of that current, ie 1100mA. Running at peak current will heat up the motor, and possibly cause the other issues you were seeing with phase shorts.
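The 80% rule of thumb above can be checked against the M906 values from the posted config.g with a few lines of arithmetic. This is only a sketch: it assumes every axis uses the same Igus motor with a 1.4A nominal rating (the thread does not confirm that), and the function and variable names are made up for illustration.

```python
# Assumption: all axes use the same motor, rated at 1400 mA "nominal".
NOMINAL_MA = 1400
DERATE_PERCENT = 80  # rule of thumb quoted in the post above

# M906 values taken from the config.g extract earlier in the thread.
configured_ma = {"X": 1000, "Y": 1200, "Z": 1300}


def derated_limit_ma(nominal_ma, percent):
    """Return the derated current limit in mA, using integer arithmetic."""
    return nominal_ma * percent // 100


limit = derated_limit_ma(NOMINAL_MA, DERATE_PERCENT)

# Collect the axes whose configured current is above the derated limit.
over_limit = {axis: ma for axis, ma in configured_ma.items() if ma > limit}

print(f"Derated limit: {limit} mA")
for axis, ma in configured_ma.items():
    status = "above" if axis in over_limit else "within"
    print(f"{axis}: {ma} mA is {status} the limit")
```

The exact 80% figure is 1120mA, which the post rounds to 1100mA. Under the same-motor assumption, the configured Y and Z currents sit above it while X sits below.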
Looking at the specification https://www.igus.eu/product/MOT-AN-S-060-002-042-L-A-AAAO?artnr=MOT-AN-S-060-002-042-L-A-AAAO these motors are 30.5mm long, with an inductance of 2mH, which is not high. I would think you don't need to run these with a high current.
For cooling see https://docs.duet3d.com/Duet3D_hardware/Duet_3_family/Duet_3_Mini_5+_Hardware_Overview#cooling
We don't recommend a specific current limit before cooling is required, because it depends on the application.
Also note that full current is always drawn by the stepper driver, even if the motor is stationary. A stationary motor can heat up just as much, or even more, than a moving motor, as there's no movement to encourage air circulation. The stepper driver varies the voltage, not the current. The M906 I and T parameters only come into effect when the machine is idle, ie no motors moving.
For a deep dive into motors and RepRapFirmware, see https://docs.duet3d.com/en/User_manual/Connecting_hardware/Motors_choosing
Ian
-
@TobiAsis your config.g file shows that you are running the drivers in stealthChop mode. It's a feature of the TMC2209 and other drivers that stealthChop mode doesn't work properly at high speeds and can lead to excessive currents and resulting errors. So we advise only using stealthChop mode at very low speeds. There is an M569 parameter that lets you set the changeover speed.
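To put a rough unit on the changeover value, the sketch below converts a TMC2209-style TPWMTHRS threshold into an axis speed. It rests on several assumptions that are not stated in this thread: that the M569 changeover parameter is the TPWMTHRS register value, that the register counts driver clock cycles between 1/256 microsteps (as the TMC2209 datasheet defines TSTEP), and that the driver clock is nominally 12 MHz. The function name is hypothetical and the threshold values 100, 2000 and 4000 are only illustrative; the microstepping (8) and steps/mm (80) are the X axis values from the config.g above.

```python
def changeover_speed_mm_per_s(tpwmthrs, microsteps, steps_per_mm,
                              f_clk_hz=12_000_000):
    """Estimate the axis speed at which a TPWMTHRS-style threshold is reached.

    Assumptions: tpwmthrs is the number of driver clock cycles between
    1/256 microsteps, and f_clk_hz is the driver clock (nominal 12 MHz).
    """
    # Rate of 1/256 microsteps at the threshold, then scaled to the
    # configured microstep resolution.
    microstep_rate = f_clk_hz / tpwmthrs * microsteps / 256
    # Convert configured microsteps per second into axis units per second.
    return microstep_rate / steps_per_mm


# X axis from the posted config: M350 X8 and M92 X80.
for threshold in (100, 2000, 4000):
    speed = changeover_speed_mm_per_s(threshold, microsteps=8, steps_per_mm=80)
    print(f"threshold {threshold}: about {speed:.2f} mm/s "
          f"({speed * 60:.0f} mm/min)")
```

Under these assumptions a larger threshold value means a lower changeover speed, and the result depends on the axis microstepping and steps/mm, so the same value gives a different speed on each axis. The speeds come out in the same axis units that M92 is configured in.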
-
@infiniteloop Thanks for the hint! Indeed I was not aware of this. You are right that the PCB is mounted in a way that there is very much restricted airflow in the back. I will make some design changes to allow some airflow in the back there. Do I understand correctly that essentially these areas are the heat dissipation zones?
@droftarts This is interesting. You mentioned that full current is always drawn when a motor is running. This sounds like, if I set the current to, let's say, 1A, and I run the motor at various speeds, it should always draw 1A until it goes into idle? I have noticed that the current draw is often significantly lower than what I have set.
I had some previous issues with a motor stalling, and when I measured, I realized that the current draw was only about half of the current I specified in config. I experimented with increasing the current, and I only reached a full current draw when I set it to 2x the rated peak current of the motor.
Is there any relationship between current draw, load, speed or microstepping? What could be the explanation for this?
@dc42 Thanks. How would you define low speeds and high speeds? I see values between 100 and 4000 in the M569 description, but there is no unit. I had made the observation that there was excessive current draw at very LOW speeds, and that it normalized with higher speeds.
-
@TobiAsis said in Driver Error and MCU overheating:
Do I understand correctly that essentially these areas are the heat dissipation zones?
Heat dissipation occurs over a much larger area, but the marked spots are of central importance for this to work: The holes are copper-coated and transport the heat vertically through the multilayer circuit board.
-
@TobiAsis said in Driver Error and MCU overheating:
You mentioned that always full current is drawn when a motor is running.
Is there any relationship between current draw, load, speed or microstepping? What could be the explanation to this?
Sorry, that's not entirely accurate, particularly as microstepping is generally used. It is accurate if you were using full stepping; at each full step, one (of the two) motor phases is at full current. However, with microstepping, between one full step of the stepper motor and the next, the voltage for a single phase varies in a sine wave from 100% to -100% and back to 100% (a transistor switches the voltage direction to reverse the polarity of the coil for the 'negative' part of the sine wave, switching magnetic direction of the coil). The voltage required for each step is set by V=I x R, where R is the coil resistance. So the current is varied to achieve the voltage required for that step, up to the maximum current limit. The stepper drivers are called 'constant current chopper drivers' because the input voltage (ie 24V) is 'chopped' (ie turned on and off very fast) to produce an averaged output voltage (for the motor coil, usually only a couple of volts), using the current as a reference, so the current remains 'constant' at that particular microstep.
Because of this, while the (maximum) current may be set to 1100mA, most of the time the stepper driver won't be drawing that much current, and averaged out over time will be drawing considerably less. Which is why the 10A fuse on V_FUSED is capable of supplying sufficient current for the 12V Regulator, 5V regulator, Stepper drivers (including external driver header), OUT 1 and OUT 2 headers, V_OUTLC1 and V_OUTLC2 selection jumpers.
Load generally doesn't matter; either the motor has the torque to overcome the load, or it doesn't and will skip steps. Speed is different, as torque reduces with speed, and there are various other complications such as back EMF, but that would require a much more complex answer, and I'm not an electrical engineer!
(Note: the above is my understanding of how a stepper driver works. I'm happy to be corrected.)
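As an illustration of the sine-wave behaviour described above, here is a minimal sketch of the target current in each phase across one electrical cycle. It assumes an idealised driver in which phase A follows a cosine and phase B a sine of the same electrical angle, scaled by the configured peak current. The 1100mA peak is the value suggested earlier in the thread; the function name and the choice of 16 sample points are purely illustrative.

```python
import math


def phase_currents(peak_ma, samples_per_cycle):
    """Return (angle_deg, phase_a_ma, phase_b_ma) over one electrical cycle.

    Idealised model: phase A = peak * cos(angle), phase B = peak * sin(angle).
    """
    rows = []
    for k in range(samples_per_cycle):
        theta = 2 * math.pi * k / samples_per_cycle
        i_a = peak_ma * math.cos(theta)
        i_b = peak_ma * math.sin(theta)
        rows.append((360.0 * k / samples_per_cycle, i_a, i_b))
    return rows


rows = phase_currents(peak_ma=1100, samples_per_cycle=16)
for angle, i_a, i_b in rows:
    print(f"{angle:6.1f} deg   A = {i_a:8.1f} mA   B = {i_b:8.1f} mA")
```

In this model neither phase ever exceeds the configured peak, and each phase only reaches it at two points in the cycle, which fits the point that the set current is a maximum rather than a constant draw.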
I had some previous issues with a motor stalling, and when I measured, I realized that the current draw was only about half of the current I specified in config. I experimented with increasing the current, and I only reached a full current draw when I set it to 2x the rated peak current of the motor.
I think the problem is how you're measuring the current. How are you doing that? If you're using an oscilloscope, you could see the current changing, depending on where you are measuring. If you're using a multimeter, you're probably seeing the average current over time, ie half the current. If the motor is stationary, it will depend on where in the step it is stopped.
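To get a feel for how far an averaged reading can sit below the configured peak, the sketch below computes the rectified mean and the RMS of an ideal sine-shaped phase current. It assumes the meter is in series with one motor phase while the motor turns steadily and that the current is a clean sine, which a real chopper driver only approximates. A reading taken on the supply input, as described earlier in the thread, is a different quantity and is not modelled here. The function name and sample count are illustrative.

```python
import math


def averaged_readings(peak_ma, samples=3600):
    """Return (rectified_mean_ma, rms_ma) for one cycle of an ideal sine."""
    values = [peak_ma * math.sin(2 * math.pi * k / samples)
              for k in range(samples)]
    # Mean of the absolute value, as an averaging meter might report it.
    rectified_mean = sum(abs(v) for v in values) / samples
    # Root mean square, as a true-RMS meter would report it.
    rms = math.sqrt(sum(v * v for v in values) / samples)
    return rectified_mean, rms


mean_ma, rms_ma = averaged_readings(peak_ma=1100)
print(f"rectified mean: {mean_ma:.0f} mA ({mean_ma / 1100:.0%} of peak)")
print(f"RMS:            {rms_ma:.0f} mA ({rms_ma / 1100:.0%} of peak)")
```

For an ideal sine the rectified mean is 2/pi of the peak and the RMS is 1/sqrt(2) of the peak, so either way the reading lands well below the configured value, roughly in line with the "about half" observation.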
Setting the current to double the recommended motor current is going to overheat the stepper motor and burn out the wiring, and cause a phase short error message. It may also damage the stepper driver. Don't do it!
Ian