Multiple machines suddenly stop. Error 0x4112
-
Here's the deal. We run a lot of printers. All run 3.4 firmware and have been running perfectly for months.
Starting today, 3 printers that were happily printing have stopped dead in the water. They all report a software reset with error code 0x4112: Last software reset at 2022-06-08 09:04, reason: StackOverflow
Here is the full M122 for one of the machines:
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.4.0 (2022-03-15 18:57:24) running on Duet 3 MB6HC v1.01 or later (standalone mode)
Board ID: 08DJM-9P63L-DJ3T0-6J9FD-3S46P-9V1V8
Used output buffers: 1 of 40 (22 max)
=== RTOS ===
Static ram: 151000
Dynamic ram: 96488 of which 92 recycled
Never used RAM 103092, free system stack 200 words
Tasks: NETWORK(ready,25.8%,251) ETHERNET(notifyWait,0.1%,6) HEAT(notifyWait,0.0%,321) Move(notifyWait,0.0%,352) CanReceiv(notifyWait,0.0%,772) CanSender(notifyWait,0.0%,374) CanClock(delaying,0.0%,339) TMC(notifyWait,7.9%,92) MAIN(running,66.2%,925) IDLE(ready,0.0%,30), total 100.0%
Owned mutexes:
=== Platform ===
Last reset 00:16:45 ago, cause: software
Last software reset at 2022-06-08 09:04, reason: StackOverflow, none spinning, available RAM 99160, slot 1
Software reset code 0x4112 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0043c80e BFAR 0x00000000 SP 0x2045ffbc Task ETHE Freestk 63981 bad marker
Stack: 204217ac 204217e0 00484c39 00000000 2041f388 00000000 004842a9 204217fc 20424cac 00000000 00f00000 e000e000 c0000000 00000000 004843c5 00484154 21000000 ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
Error status: 0x00
Aux0 errors 0,0,0
Step timer max interval 127
MCU temperature: min 33.9, current 34.1, max 43.3
Supply voltage: min 23.5, current 23.6, max 23.7, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.1, current 12.2, max 12.2, under voltage events: 0
Heap OK, handles allocated/used 0/0, heap memory allocated/used/recyclable 0/0/0, gc cycles 0
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 0, mspos 790, reads 34575, writes 14 timeouts 0
Driver 1: standstill, SG min 0, mspos 642, reads 34575, writes 14 timeouts 0
Driver 2: standstill, SG min 0, mspos 854, reads 34575, writes 14 timeouts 0
Driver 3: standstill, SG min 0, mspos 578, reads 34575, writes 14 timeouts 0
Driver 4: standstill, SG min 0, mspos 322, reads 34575, writes 14 timeouts 0
Driver 5: standstill, SG min 0, mspos 394, reads 34576, writes 14 timeouts 0
Date/time: 2022-06-08 09:21:04
Slowest loop: 4.36ms; fastest: 0.05ms
=== Storage ===
Free file entries: 10
SD card 0 detected, interface speed: 25.0MBytes/sec
SD card longest read time 3.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, segments created 0, maxWait 0ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty
=== CAN ===
Messages queued 9060, received 12081, lost 0, boc 0
Longest wait 2ms for reply type 6053, peak Tx sync delay 170, free buffers 50 (min 49), ts 5026/5025/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 4.63ms; fastest: 0.02ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions Telnet(0), 0 sessions
HTTP sessions: 2 of 8
- Ethernet -
State: active
Error counts: 0 0 0 0 0
Socket states: 5 2 2 2 2 0 0 0
What is happening with my printers?!?
Just a side note: this is a production environment. One job had been running for more than 90 hours when this happened. So please help me diagnose this issue before I lose more printers and days of work...
-
Ok printer 4 just had the same issue.
-
@vinculum I'm sure one of the Duet3D folks will be along shortly to try and help, but to try and get you working again quickly... If you have been running happily for months before this problem then that almost certainly means that something has recently changed to cause the problem. The trick is trying to identify what it is. Have you updated any networking components recently (it looks like the stack overflow is in the ethernet code)? Have you changed the way you work at all? Any changes to your printer configurations?
-
@gloomyandy
I understand, but as you said, it is not that simple. The way we work, the machines, the firmware and even the gcode have not changed. As far as I know, no changes were made to the network either.
At this point I now have a total of 6 machines that had this. All in one morning.
-
Each time this happens, can you gather an M122 report and share it here? That would be most helpful in isolating the issue. It would also help to share your config.g and related config files, plus a sample gcode file, so we can try to reproduce it.
The only other thing I can suggest in the meantime is to try updating to 3.4.1 in case a fix in there helps.
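If you end up collecting reports from several machines, something like the following might make it easier to compare them side by side. It is only a rough sketch: `parse_reset_info` is a made-up helper name, the regular expressions are guesses based on the single M122 report posted above, and other firmware versions may word these lines differently.

```python
import re


def parse_reset_info(report: str) -> dict:
    """Pull the software-reset fields out of the text of an M122 report.

    The patterns are assumptions derived from the one report in this
    thread; a field that is not found is returned as None.
    """
    patterns = {
        "firmware": r"version (\S+)",
        "reset_time": r"Last software reset at (\d{4}-\d{2}-\d{2} \d{2}:\d{2})",
        "reason": r"reason: (\w+)",
        "reset_code": r"Software reset code (0x[0-9a-fA-F]+)",
        # "Task " with a trailing space, so the "Tasks:" list is not matched.
        "task": r"\bTask (\w+)",
        "free_stack": r"Freestk (\d+)",
    }
    info = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, report)
        info[key] = match.group(1) if match else None
    return info


# Excerpt copied from the report posted earlier in this thread.
sample = (
    "RepRapFirmware for Duet 3 MB6HC version 3.4.0 (2022-03-15 18:57:24) "
    "Last software reset at 2022-06-08 09:04, reason: StackOverflow, none spinning, "
    "available RAM 99160, slot 1 Software reset code 0x4112 HFSR 0x00000000 "
    "SP 0x2045ffbc Task ETHE Freestk 63981 bad marker"
)

for field, value in parse_reset_info(sample).items():
    print(f"{field}: {value}")
```

Running this over each saved report should show quickly whether every machine reports the same reset code and the same task, which is the kind of pattern that helps narrow things down.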