Here's the deal. We run a lot of printers. All of them run firmware 3.4.0 and have been running perfectly for months.
Starting today, I have already had 3 printers that were happily printing suddenly stop dead in the water.
They all report a software reset with reset code 0x4112: Last software reset at 2022-06-08 09:04, reason: StackOverflow
Here is the full M122 for one of the machines:
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.4.0 (2022-03-15 18:57:24) running on Duet 3 MB6HC v1.01 or later (standalone mode)
Board ID: 08DJM-9P63L-DJ3T0-6J9FD-3S46P-9V1V8
Used output buffers: 1 of 40 (22 max)
=== RTOS ===
Static ram: 151000
Dynamic ram: 96488 of which 92 recycled
Never used RAM 103092, free system stack 200 words
Tasks: NETWORK(ready,25.8%,251) ETHERNET(notifyWait,0.1%,6) HEAT(notifyWait,0.0%,321) Move(notifyWait,0.0%,352) CanReceiv(notifyWait,0.0%,772) CanSender(notifyWait,0.0%,374) CanClock(delaying,0.0%,339) TMC(notifyWait,7.9%,92) MAIN(running,66.2%,925) IDLE(ready,0.0%,30), total 100.0%
Owned mutexes:
=== Platform ===
Last reset 00:16:45 ago, cause: software
Last software reset at 2022-06-08 09:04, reason: StackOverflow, none spinning, available RAM 99160, slot 1
Software reset code 0x4112 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0043c80e BFAR 0x00000000 SP 0x2045ffbc Task ETHE Freestk 63981 bad marker
Stack: 204217ac 204217e0 00484c39 00000000 2041f388 00000000 004842a9 204217fc 20424cac 00000000 00f00000 e000e000 c0000000 00000000 004843c5 00484154 21000000 ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
Error status: 0x00
Aux0 errors 0,0,0
Step timer max interval 127
MCU temperature: min 33.9, current 34.1, max 43.3
Supply voltage: min 23.5, current 23.6, max 23.7, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.1, current 12.2, max 12.2, under voltage events: 0
Heap OK, handles allocated/used 0/0, heap memory allocated/used/recyclable 0/0/0, gc cycles 0
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 0, mspos 790, reads 34575, writes 14 timeouts 0
Driver 1: standstill, SG min 0, mspos 642, reads 34575, writes 14 timeouts 0
Driver 2: standstill, SG min 0, mspos 854, reads 34575, writes 14 timeouts 0
Driver 3: standstill, SG min 0, mspos 578, reads 34575, writes 14 timeouts 0
Driver 4: standstill, SG min 0, mspos 322, reads 34575, writes 14 timeouts 0
Driver 5: standstill, SG min 0, mspos 394, reads 34576, writes 14 timeouts 0
Date/time: 2022-06-08 09:21:04
Slowest loop: 4.36ms; fastest: 0.05ms
=== Storage ===
Free file entries: 10
SD card 0 detected, interface speed: 25.0MBytes/sec
SD card longest read time 3.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, segments created 0, maxWait 0ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty
=== CAN ===
Messages queued 9060, received 12081, lost 0, boc 0
Longest wait 2ms for reply type 6053, peak Tx sync delay 170, free buffers 50 (min 49), ts 5026/5025/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 4.63ms; fastest: 0.02ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions Telnet(0), 0 sessions
HTTP sessions: 2 of 8
- Ethernet -
State: active
Error counts: 0 0 0 0 0
Socket states: 5 2 2 2 2 0 0 0
What is happening with my printers?!?
Just a side note: this is a production environment. One job was well past the 90-hour mark when this happened. So please help me diagnose this issue before I lose more printers and days of work...
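
In case it helps with comparing the dumps from the affected machines, here is a rough sketch of how the reset details could be pulled out of saved M122 output. It is only an illustration: `summarize_m122` is a made-up helper name, the 50-word threshold is arbitrary, and I am assuming the last number in each `Tasks:` entry is that task's free stack in words. The sample lines are copied from the dump above.

```python
import re

# Matches entries such as "ETHERNET(notifyWait,0.1%,6)" on the "Tasks:" line.
TASK_RE = re.compile(r"(\w+)\((\w+),([\d.]+)%,(\d+)\)")
# Matches "Last software reset at <date> <time>, reason: <reason>".
RESET_RE = re.compile(r"Last software reset at ([\d-]+ [\d:]+), reason: (\w+)")
# Matches the reset code and the task name on the "Software reset code" line.
CODE_RE = re.compile(r"Software reset code (0x[0-9a-fA-F]+).*?Task (\w+)")


def summarize_m122(text, low_stack_words=50):
    """Collect reset info and low-stack tasks from M122 text (hypothetical helper)."""
    summary = {"reset_time": None, "reason": None, "code": None,
               "task": None, "low_stack": []}
    reset = RESET_RE.search(text)
    if reset:
        summary["reset_time"], summary["reason"] = reset.groups()
    code = CODE_RE.search(text)
    if code:
        summary["code"], summary["task"] = code.groups()
    for line in text.splitlines():
        if line.startswith("Tasks:"):
            for name, _state, _cpu, free in TASK_RE.findall(line):
                # Assumption: the last field is the task's free stack in words.
                if int(free) < low_stack_words:
                    summary["low_stack"].append((name, int(free)))
    return summary


SAMPLE = """\
Tasks: NETWORK(ready,25.8%,251) ETHERNET(notifyWait,0.1%,6) HEAT(notifyWait,0.0%,321) Move(notifyWait,0.0%,352) CanReceiv(notifyWait,0.0%,772) CanSender(notifyWait,0.0%,374) CanClock(delaying,0.0%,339) TMC(notifyWait,7.9%,92) MAIN(running,66.2%,925) IDLE(ready,0.0%,30), total 100.0%
Last software reset at 2022-06-08 09:04, reason: StackOverflow, none spinning, available RAM 99160, slot 1
Software reset code 0x4112 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0043c80e BFAR 0x00000000 SP 0x2045ffbc Task ETHE Freestk 63981 bad marker
"""

result = summarize_m122(SAMPLE)
print(result)
```

Running the same function over the M122 output of each failed printer should make it easy to see whether they all name the same task and whether the same tasks are running close to their stack limit.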