Duet2 WiFi shuts down spontaneously during printing job
-
During a three-day print job, my Duet 2 WiFi shut down for no visible reason.
After turning the 3D printer on again, I got the diagnostic information below (see "=== Platform ==="):
I am now unsure whether to retry the job, since it is so time-consuming, and I would like to have a clue as to why this happened.
Thanks in advance.
M122
=== Diagnostics ===
RepRapFirmware for Duet 2 WiFi/Ethernet version 3.4.6 (2023-07-21 14:08:28) running on Duet WiFi 1.02 or later
Board ID: 0JD0M-9P6M2-NW4SS-6JKDA-3SN6M-9UW7K
Used output buffers: 1 of 26 (19 max)
=== RTOS ===
Static ram: 23896
Dynamic ram: 75116 of which 0 recycled
Never used RAM 13068, free system stack 180 words
Tasks: NETWORK(ready,14.8%,242) HEAT(notifyWait,0.0%,333) Move(notifyWait,0.0%,363) MAIN(running,84.9%,446) IDLE(ready,0.2%,30), total 100.0%
Owned mutexes: WiFi(NETWORK)
=== Platform ===
Last reset 00:09:56 ago, cause: power up
Last software reset at 2024-03-31 14:45, reason: StuckInSpinLoop, GCodes spinning, available RAM 9332, slot 1
Software reset code 0x4083 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0041f80f BFAR 0xe000ed38 SP 0x2000231c Task MAIN Freestk 763 ok
Stack: 0042c6eb 0045747c 61000000 3f800000 3f13f800 422cc5f7 3edba1b7 3331bb4c 00000000 20000c98 00000001 00000001 00000028 20004c64 40090000 0042c96d 00000000 00000001 00002283 0042d0fb 0042dd44 9f524ef3 3f800000 0077ffff 422cc5f7 00000000 00000000
Error status: 0x00
Aux0 errors 0,0,0
Step timer max interval 0
MCU temperature: min 36.9, current 39.8, max 40.8
Supply voltage: min 23.9, current 24.0, max 24.2, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 0/0, heap memory allocated/used/recyclable 0/0/0, gc cycles 0
Events: 0 queued, 0 completed
Driver 0: standstill, SG min n/a
Driver 1: standstill, SG min n/a
Driver 2: standstill, SG min n/a
Driver 3: standstill, SG min n/a
Driver 4: standstill, SG min n/a
Driver 5:
Driver 6:
Driver 7:
Driver 8:
Driver 9:
Driver 10:
Driver 11:
Date/time: 2024-03-31 15:59:32
Cache data hit count 4294967295
Slowest loop: 15.51ms; fastest: 0.18ms
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
=== Storage ===
Free file entries: 9
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest read time 1.7ms, write time 3.2ms, max retries 0
=== Move ===
DMs created 83, segments created 0, maxWait 0ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
Daemon is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty
=== Network ===
Slowest loop: 15.59ms; fastest: 0.00ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
HTTP sessions: 1 of 8
= WiFi =
Interface state: active
Module is connected to access point
Failed messages: pending 0, notready 0, noresp 0
WiFi firmware version 1.27
WiFi MAC address ec:fa:bc:dd:c5:3b
WiFi Vcc 3.40, reset reason Turned on by main processor
WiFi flash size 4194304, free heap 22240
WiFi IP address 192.168.178.52
WiFi signal strength -63dBm, mode 802.11n, reconnections 0, sleep mode modem
Clock register 00002002
Socket states: 0 0 0 0 0 0 0 0 -
Have you tried simulating the print? If there is a problem with the gcode, sd card or possibly a bug that is being triggered by that print, it might show up.
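As a rough sketch of what simulating means here: RepRapFirmware can run a job file from the SD card in simulation mode with M37, processing the G-code without heating or moving anything. The file name below is a placeholder, not the actual job from this thread.

```gcode
; Simulate a job stored in the gcodes folder of the SD card (file name is hypothetical)
M37 P"long-job.gcode"
; Once the simulation has finished, request the diagnostics report
M122
```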
-
@gloomyandy Yes I did, did not notice anything abnormal.
But I did not mention that I have never had a print job that would take that long. It stopped about one and a half days after the start. The longest print jobs I have run so far took about one day. Also, let me remark that I slightly changed the speed settings specially for this job so that it runs more smoothly (lower speeds, accelerations and jerk values), in order to rule out any layer shift due to sudden movements (since I do not want to expose the job to risks). So I don't expect the steppers to run too hot. I did actually check their temperature and found it as usual.
What does "reason: StuckInSpinLoop" actually mean, if you happen to know?
-
@Triet said in Duet2 WiFi shuts down spontaneously during printing job:
What does "reason: StuckInSpinLoop" actually mean, if you happen to know?
It means pretty much what it says. Many parts of RRF have code that needs to be executed by the main task. These are typically called in a so-called "SpinLoop", basically a function called by the main loop of the main task. The main task monitors whether any of these calls takes a long time to execute, and if one does, it triggers an exception like the one you have. In this particular case the code that "got stuck/took a long time" was the main gcode processing code. This may be caused by a bug, or could be the result of a hardware fault (something like a bad SD card, for instance).
If multiple simulations run ok, your SD card is probably ok. It might be worth pulling the card and using a PC to run a scandisk or whatever on it to check for possible errors. You could I suppose try executing the code without any filament loaded, that way you will not be wasting filament if it fails again. Other than that I don't think there is much else you can do to test things before trying another print.
-
@gloomyandy said in Duet2 WiFi shuts down spontaneously during printing job:
Other than that I don't think there is much else you can do to test things before trying another print.
I discovered a similar error message in this post:
https://forum.duet3d.com/topic/17835/stuck-in-spin-loop-spinning-module-g-codes/14
where it was caused by noise picked up by unshielded wires, in particular the two-wire normally-open connection to the endstops. This applies to my situation too. I will have to shield or twist these wires, or do whatever applies, to protect them against electromagnetic interference.
Do you know of any type of shielded two-wire cable that you can recommend, or is this such a trivial issue that a suitable cable should be readily available?
I will also check the connections to ground. This is a DIY printer and I never cared about that. Is there a link regarding grounding the Duet2 WiFi and its components (stepper cases, etc.)? And should all these ground points also connect to mains ground?
-
@Triet said in Duet2 WiFi shuts down spontaneously during printing job:
it was caused by noise to the unshielded wires, particularly, the two-wires normally open connection to the end stops. This applies to my situation too.
Are you able to change your endstop wiring to use normally-closed contacts? When closed the connections are much less sensitive to interference.
-
@dc42 said in Duet2 WiFi shuts down spontaneously during printing job:
Are you able to change your endstop wiring to use normally-closed contacts? When closed the connections are much less sensitive to interference.
If the Duet2 WiFi is able to do that, then I am too.
Now I can understand a number of sudden layer shifts (rare but large shifts, always on the X axis). They were definitely not caused by sliding resistance, loose belts, too low a stepper current or violent movements; I made sure that would not happen. As the tool moves, I have observed that the switch arm dangles (but that may be unrelated).
I am considering disabling the endstops after homing and during the print until I get properly shielded two-wire cables, but so far I cannot find the correct G-code command.
-
@Triet
Now I am responding to myself.
I found that the LED is illuminated when the endstop switch is connected and triggered (otherwise unlit). If I understand the documentation correctly, this means that my endstop switches are "normally open" (NO).
I also found that they were connected to the Duet2 via 2 wires only. The 3.3V wire (in the middle of the terminal) was not connected. It was working nevertheless, at least most of the time. This might have made the cable more vulnerable to electromagnetic noise.
Do I correctly assume that I will have to replace my endstops in order to have endstop switches of the NC type?
For the time being I have twisted all three wires, but I cannot figure out the command to disable triggering of the endstops (similar to Marlin's M121), just to be safe against layer shifts during printing due to erratic triggering until I get NC-type endstops.
-
@Triet
OK, I rewired the connection to the endstops and changed the configuration from
M574 X1 S1 P"!xstop"
M574 Y2 S1 P"!ystop"
to
M574 X1 S1 P"xstop"
M574 Y2 S1 P"ystop"
so the microswitches are now "normally closed" and therefore less vulnerable to noise. The LEDs confirm that. Homing works fine.
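One way to double-check the rewired switches, sketched here on the assumption that the endstops are configured as in the M574 lines above: M119 reports the current endstop states, so it can be sent once with the carriage away from the switches and once with a switch pressed by hand.

```gcode
; Report endstop states; send with the switches released, then again with one held down
M119
```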
Case closed, unless you disagree.
-
@Triet Sorry to be back again.
I had a shutdown again, with no apparent cause.
Now, trying to diagnose it, I issued the M122 command, but I get no output... What might be wrong?
The printer is back after switching it on and shows no other signs of irregularities.
But before I restart this job (it takes several days), I would at least like to know what is happening with my Duet2 WiFi that it is not showing any diagnostic information.
Thanks guys
-
@Triet After forcing the console to output just "Test" using M291 P"Test", M122 also started to work as expected. Strange!
Anyway, diagnosing the cause of my last spontaneous shutdown (after longer than a day), I found:
=== Platform ===
Last reset 01:08:02 ago, cause: power up
Last software reset at 2024-04-09 20:28, reason: StuckInSpinLoop, GCodes spinning, available RAM 9212, slot 2
Software reset code 0x4083 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0041f80f BFAR 0xe000ed38 SP 0x2000231c Task MAIN Freestk 763 ok
Stack: 0042c6eb 0045747c 61000000 3f800000 3e88b400 a0000000 bfe6a7ef 3331bb4c 00000000 20000c98 00000001 00000001 00000028 20004c64 40090000 0042c96d 00000000 00000001 00002283 0042d0fb 0042dd34 ea9b4b6c b5dde12a 0077ffff bb360b61 00000000 00000000
Error status: 0x04
Aux0 errors 0,0,0
I have no idea what that means exactly, but it looks like a software crash.
I might be overthinking it, but both times I had this shutdown it happened after more than a day of printing, while the job was about 30% complete.
Is there a timeout built in somewhere that limits how long jobs can run?
Currently, I am unable to print a specific model I need, because it takes that long. This is the first time I have ever printed anything taking over a day (this job: longer than two days).
-
It may be worth trying the 3.5 RC4 that just released for testing. You may want to try a dry run without filament first to see if it can complete.
Have you been able to simulate the job successfully?
-
@Triet did you have the Duet USB port connected to anything? It's possible to get "stuck in spin loop" if the firmware is trying to generate debug output and the USB port is connected to a PC but no program on the PC is reading it.
-
@dc42 said in Duet2 WiFi shuts down spontaneously during printing job:
did you have the Duet USB port connected to anything?
I am now aware of M111 for diagnostic purposes but have not used it. I am afraid that I would be wasting my time, since I am not acquainted with interpreting its output.
So no, nothing connected to USB.
-
@Phaedrux said in Duet2 WiFi shuts down spontaneously during printing job:
It may be worth trying the 3.5 RC4 that just released for testing.
That is a good idea. If it happens again I will try it.
My "simulation" consists of printing the model in parts and gluing them together when I am finished. I just measure the height up to the shutdown, then move the model down by the same amount on the slicer plate, and restart the job. This is not optimal, and I hope I can glue the parts in an acceptable way.
Just at this very moment I am configuring the "resurrection" function: letting the job continue at the point where the shutdown occurred, assuming this shutdown is interpreted as a power failure and not as an intended interruption. That means that during the print the current status is continuously saved to the SD card, which is something I dislike. This is all trial and error.
-
@Triet said in Duet2 WiFi shuts down spontaneously during printing job:
Just at this very moment I am configuring the "resurrection" function:
This is worth trying:
M911 S23 R24 P"M913 X0 Y0 G91 M83 G1 Z3 E-5 F1000"
Actually, the current status does not need to be constantly saved to the SD card. Using an oversized 600 W Meanwell power supply, I hope this will work well, assuming that writing to SD is fast enough.
-
@Triet said in Duet2 WiFi shuts down spontaneously during printing job:
This is worth trying:
M911 S23 R24 P"M913 X0 Y0 G91 M83 G1 Z3 E-5 F1000"
Just to let you know: the resurrection function could not be implemented satisfactorily.
With the following setting in config.g:
M911 S22 R24 P"M913 X0 Y0 G91 M83 G1 Z3 E-8 F1000"
and with a resurrect-prologue.g file like this:
M116 ; wait for temperatures
G28 X Y ; home X and Y, hope that Z hasn't moved
M83 ; relative extrusion
G91
G1 Z-3 E8 F3600 ; undo the retraction that was done in the M911 power fail script
the printer does everything as commanded, but it resumes printing at a slightly higher Z position, so it ends up printing in midair. It does not work, probably because the bed slides down a bit after power loss. Without a max endstop switch to re-home the Z axis, this function is unusable.
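For readers following along, the M911 line above can be read parameter by parameter. This is an annotated sketch based on the values quoted in this post, not a recommended configuration; suitable thresholds depend on the power supply.

```gcode
; S22    : save the print state when the supply voltage falls below 22 V
; R24    : voltage at which the firmware treats power as restored
; P"..." : commands run at the moment of power failure:
;   M913 X0 Y0       set the X and Y motor currents to 0%
;   G91, M83         relative positioning and relative extrusion
;   G1 Z3 E-8 F1000  raise Z by 3 mm and retract 8 mm of filament
M911 S22 R24 P"M913 X0 Y0 G91 M83 G1 Z3 E-8 F1000"

; After power returns, resume the saved job (runs resurrect.g on the SD card)
M916
```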
-
@Triet said in Duet2 WiFi shuts down spontaneously during printing job:
Without a max endstop switch to re-home the Z axis this function is unusable.
Or set the Z position manually.
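A minimal sketch of setting the Z position by hand, assuming the true nozzle height has been measured some other way (the 42.6 mm below is a made-up value): G92 tells the firmware what the current position is without moving the axis, and M114 reports the position back.

```gcode
G92 Z42.6   ; declare the current Z position (hypothetical measured height in mm)
M114        ; report the current position to confirm
```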
-
@Phaedrux said in Duet2 WiFi shuts down spontaneously during printing job:
Or set the Z position manually.
Yes, but which coordinates are saved at the time of the power outage?