How should the firmware respond to a heater fault?
-
I'm coming close to finalising firmware 1.20 and one of the items that needs to be sorted is how to respond to a heater fault. Here is a possible sequence of responses to a heater fault that occurs when a print from SD card is in progress:
1. Turn off the faulting heater.
2. Execute pause.g to move the head away from the print. This will create a resurrect.g file.
3. Turn off all extruder heaters, leaving the bed and chamber heaters running (except for the heater that faulted) to maximise the chance of being able to restart the print.
4. Pop up a message telling the user what happened. If logging is enabled, log the event.
5. Wait for a configurable timeout period (which could be zero) to allow the user to clear the heater fault and resume the print.
6. If the timeout expires, turn all heaters and motors off, and also try to turn off power using M81.
Would this satisfy all the common use cases? Any other suggestions?
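To make the ordering of the six steps concrete, here is a minimal Python sketch that records the actions in sequence. It is only an illustration of the proposal above, not firmware code: the `Printer` class, the action strings, and the `fault_cleared_in_time` flag are all hypothetical names invented for this example, and the real timeout wait in step 5 is reduced to a boolean.

```python
from dataclasses import dataclass, field


@dataclass
class Printer:
    """Hypothetical stand-in for firmware state; not a RepRapFirmware API."""
    heaters: dict                      # heater name -> "extruder", "bed" or "chamber"
    logging_enabled: bool = False
    actions: list = field(default_factory=list)

    def __post_init__(self):
        self.on = set(self.heaters)    # assumption: every heater starts switched on

    def heater_off(self, name):
        # Record the action only if the heater is still on.
        if name in self.on:
            self.on.discard(name)
            self.actions.append(f"heater_off:{name}")


def respond_to_heater_fault(printer, faulty, fault_cleared_in_time):
    """Apply steps 1-6 in order and return the list of recorded actions."""
    printer.heater_off(faulty)                       # step 1: faulting heater off
    printer.actions.append("run:pause.g")            # step 2: move head away
    for name, kind in printer.heaters.items():       # step 3: extruder heaters off,
        if kind == "extruder":                       #         bed/chamber stay on
            printer.heater_off(name)
    message = f"heater fault on {faulty}"
    printer.actions.append(f"message:{message}")     # step 4: tell the user
    if printer.logging_enabled:
        printer.actions.append(f"log:{message}")
    if fault_cleared_in_time:                        # step 5: user cleared the fault
        return printer.actions
    for name in list(printer.heaters):               # step 6: everything off
        printer.heater_off(name)
    printer.actions.append("motors_off")
    printer.actions.append("M81")
    return printer.actions
```

With a bed heater and two extruder heaters, a fault on one extruder that is not cleared in time turns that heater off first, leaves the bed running until the timeout, and ends with `M81`.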
-
I like step 2. I had a heater fault myself due to a loose connector and didn't like the nozzle being left in the print.
If a PanelDue were connected or a PC were on, could you also play an audible alarm?
-
I'm a little confused: are there two timeouts being considered here? In 1.19.2 there is currently a timeout that has to be exceeded before the fault is raised. Is there then an additional timeout before step 6 is triggered?
Step 6 is a real positive step towards minimising the risk for a thermal runaway and reducing the need to permanently babysit the machine. A 10hr+ build is I assume not unusual and many wouldn't get through that without needing a #2!
-
I think that covers a very sensible sequence which allows for maximum recoverability without compromising safety in any major way.
-
This sounds great. All the safety and flexibility to continue.
-
Sounds good, as long as we can disable the (eventual) bed heater timeout if we don't want it.
-
Was this implemented? I want to be able to shut down the heaters using M81 on a heater fault but can't find the G-code to set this up.
-
I continued to dig around and thought to check the 1.20 release notes at the following link:
https://github.com/dc42/RepRapFirmware/blob/dev/WHATS_NEW.md
This has the comment:
"When a heater fault occurs, the print is now paused and all heaters are turned off except bed and chamber heaters. After a timeout period, the print is cancelled, all remaining heaters are turned off, and the firmware attempts to turn the power off as if M81 had been received."
I am guessing this timeout value is the timeout mentioned to cancel a print:
"Heater fault timeout to cancelling print is now configurable (M570 S parameter, in minutes)"
Why does the G-code reference describe M570 as the time in seconds that the heater has to reach a set temperature before a fault is raised? From the release notes I was expecting M570 to be the time in minutes after a fault before the build is cancelled and M81 is called.
https://duet3d.dozuki.com/Wiki/Gcode#Section_M570_Configure_heater_fault_detection
-
I like the sequence.
-
Something I would like:
The error message is not very clear. A week ago I had a heater fault on every large print. The bed shut off (that's bad because I use an Ultrabase, so the print popped off once the bed was cold), but I couldn't tell exactly where the problem was. It would be good to be able to read the temperature chart, because if the temperature dropped 100° within a second it's clearly the thermistor, and if it drops more slowly it's the heater cartridge.
By the way, the black plugs on the Smart Effector are poor: there is only a small chance of getting a faulty pin out of the plug housing. I wish it had the other plugs there too.
-
@barracuda72 Step 3 would leave bed ON, therefore covering your use case.
In general I would replace step 6 with a macro (something like heaterfault0.g). This macro could include the actions described for M81 or be configured by the user to suit their preferences.
A heater fault on the bed is not the same as a heater fault on the nozzle. The basic safety steps could cover any failure mode, but a macro associated with each heater would be the more flexible option and would leave room for improvements without further code changes. Something like the tool change macros for each defined tool.
-
I like the idea of replacing the sequence of steps with a macro or macros. That is in line with the general RepRapFirmware philosophy. We can have the current behaviour in the macro(s) as a default, so no change is noticed unless someone wants to change their macros.
(also p.s. zombie thread revival!)
-
@t3p3tony said in How should the firmware respond to a heater fault?:
(also p.s. zombie thread revival!)
Ish. I was asking about the current implementation of the feature as it isn't clear from the release notes and the release notes don't tally with the duet g-code documentation.
-
@t3p3tony
Replacing the sequence with a macro could be dangerous.
Assume a faulty or detached thermistor: if someone made an incorrect or empty heaterfault.g file that did not turn off the faulty heater, and eventually all heaters, it would quite probably cause a fire. IMHO some safety features should never be modifiable by mere mortals / end users.
Plus, if I'm correct, quite a lot of functionality is missing to implement this. The G-code would have to receive props (variables / constants) describing the faulty heater, and accept variables / constants as parameters, to be able to turn off only the faulty heater. There is also the problem of the timeout that turns all heaters off at the end: it should be interruptible, so that the code to turn all heaters off is not executed if the heater fault is cleared.
So:
Function defining
variable handling
variable passing
reference passing
conditional execution ( if ... else )
loops (iteration ... it is not strictly required, but its absence makes the code ugly)

function heaterfault (*heaterFault, faultyHeater, allHeaters) {
    // turn off the faulty heater straight away
    heater_off(faultyHeater);
    // wait up to 3600 seconds, stopping early if the fault is cleared
    cnt = 3600;
    while (cnt-- > 0 && *heaterFault == true) {
        sleep 1;
    };
    // fault still present after the timeout: turn every heater off
    if (*heaterFault == true) {
        for (heater, allHeaters) {
            heater_off(heater);
        };
    };
};

The code is very incomplete; it just serves to illustrate the scope of the task...
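For comparison, the same interruptible timeout can be sketched in runnable Python. This is only an illustration of the idea in the pseudocode, not something the firmware's G-code can express: `heater_fault_timeout`, the `fault_cleared` event and the `heater_off` callback are hypothetical names, and a `threading.Event` stands in for the pointer to the fault flag.

```python
import threading


def heater_fault_timeout(fault_cleared, faulty, all_heaters, heater_off,
                         timeout_s=3600.0):
    """Turn off the faulty heater, then wait for the fault to be cleared.

    Returns True if fault_cleared was set before the timeout expired.
    Otherwise turns every heater off and returns False.
    """
    heater_off(faulty)
    # Event.wait() returns True as soon as the event is set,
    # or False once timeout_s seconds have elapsed.
    if fault_cleared.wait(timeout_s):
        return True
    for heater in all_heaters:
        heater_off(heater)
    return False
```

Whatever clears the fault (a user command, for example) would call `fault_cleared.set()`, which ends the wait immediately instead of polling once per second.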
-
Possibly we can get part of the way by having a macro for a heater fault on each heater, but I take your point about a user misconfiguring it. It's a trade-off between usability in some cases and people misconfiguring something where the defaults are sensible.
-
Can someone please clear up my question regarding the difference between the release notes and the wiki?!
As regards safety, I can't see why it shouldn't be simple: as safe as possible by default, with the option to opt out if required. Make this clear in the release notes and we should only have a few months of posts questioning why things aren't working, like the recent no move before homing!
-
@t3p3tony
My opinion is that the end user should never have the freedom to set his/her equipment, and consequently his/her house, on fire, however negligent / short-sighted he/she may be.
-
My take on it would be that any heater fault detected should immediately reset the printer, thereby disabling all sources of heat generation, i.e. bed and heater cartridges, and motors.
In my own recent experience, a Simplify3D remote USB print failed hours into the print: the hot end heater stopped, but the bed and motors carried on as if nothing had happened.
I realise that, the print being remotely driven, the web interface would not know about it, but the Duet board had detected and raised the problem and, to my mind, didn't cleanly shut down the printer.
2 penneth
-
@dr_ju_ju said in How should the firmware respond to a heater fault?:
My take on it would be that any heater fault detected should immediately reset the printer, thereby disabling all sources of heat generation, i.e. bed and heater cartridges, and motors.
If a MOSFET fails short (i.e. heater stuck on) then a controller reset will not prevent heating. What you ideally want is a digital I/O to a kill switch that cuts the mains power.