3.6.b3 mainboards as expansion boards disconnect during printing
-
Summary:
When running 3.6.0-beta.3 on a configuration with a 6HC mainboard and a 2nd 6HC connected as an expansion board -> as soon as the printer actually starts printing, the system returns the error "Expansion Board 1 Reconnected". After that, all motor drivers on the 2nd 6HC stop responding. M122 B1 looks something like this:
M122 B1
Diagnostics for board 1:
RepRapFirmware for Duet 3 MB6HC version 3.6.0-beta.3 (2025-01-16 19:09:36) running on Duet 3 MB6HC v1.01
Last reset 00:00:21 ago, cause: software
Hiccups 0 (0.00/0.00ms), segs 427
Driver 0: 80.0 steps/mm, standstill, SG min n/a, mspos 648, reads 42032, writes 14 timeouts 0
Driver 1: 80.0 steps/mm, standstill, SG min n/a, mspos 8, reads 42038, writes 11 timeouts 0
Driver 2: 800.0 steps/mm, standstill, SG min n/a, mspos 296, reads 42040, writes 11 timeouts 0
Driver 3: 80.0 steps/mm, standstill, SG min n/a, mspos 296, reads 42043, writes 11 timeouts 0
Driver 4: 80.0 steps/mm, standstill, SG min n/a, mspos 296, reads 42045, writes 11 timeouts 0
Driver 5: 80.0 steps/mm, standstill, SG min n/a, mspos 8, reads 42048, writes 11 timeouts 0
VIN: 24.2V, V12: 12.1V, MCU temperature: min 39.0C, current 39.7C, max 39.9C
Peak sync jitter -1/1, peak Rx sync delay 175, resyncs 0/0, next step interrupt due in 52 ticks, disabled
NOTE: the settings listed here don't even match the configuration for the 2nd 6HC, as its steps/mm are supposed to be 727:727:1600:1600:1600. They also don't match the mainboard 6HC, as those are all supposed to be 80 steps/mm.
If you run M122 B1 after a reboot -> it returns the expected values.
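For anyone who wants to check this comparison repeatedly rather than by eye, here is a minimal Python sketch that pulls the per-driver steps/mm out of an M122 B1 report and flags the drivers that differ from what the config should produce. The function names and the `expected` mapping are hypothetical; which driver number carries which of the 727/1600 values is an assumption for illustration only, so substitute your own driver assignment.

```python
import re

# Matches per-driver entries of an M122 B<n> report, e.g.
# "Driver 2: 800.0 steps/mm, standstill, ..."
DRIVER_RE = re.compile(r"Driver (\d+): ([\d.]+) steps/mm")


def reported_steps_per_mm(m122_text):
    """Return {driver_number: steps_per_mm} parsed from M122 B<n> output."""
    return {int(d): float(s) for d, s in DRIVER_RE.findall(m122_text)}


def mismatches(reported, expected, tol=0.05):
    """Return {driver: (reported, expected)} for drivers that differ or are missing."""
    return {
        d: (reported.get(d), want)
        for d, want in expected.items()
        if reported.get(d) is None or abs(reported[d] - want) > tol
    }


# Excerpt of the report quoted above (first three drivers only).
sample = (
    "Driver 0: 80.0 steps/mm, standstill, SG min n/a, mspos 648, reads 42032, writes 14 timeouts 0 "
    "Driver 1: 80.0 steps/mm, standstill, SG min n/a, mspos 8, reads 42038, writes 11 timeouts 0 "
    "Driver 2: 800.0 steps/mm, standstill, SG min n/a, mspos 296, reads 42040, writes 11 timeouts 0"
)

# ASSUMPTION: hypothetical driver-to-value mapping, not taken from the actual config.
expected = {0: 727.0, 1: 727.0, 2: 1600.0}

if __name__ == "__main__":
    for driver, (got, want) in sorted(mismatches(reported_steps_per_mm(sample), expected).items()):
        print(f"Driver {driver}: reported {got}, expected {want}")
```

Running the same check on the report taken after a clean reboot should produce no output if the board has picked up its configuration.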
Expected Results:
When initiating a print file or manually written gcode that follows proper formatting -> the firmware/machine should execute it as directed without issues.
Actual Results:
When initiating a print file or manually written gcode that follows proper formatting -> the machine will initiate the file and shortly after return an error for
Expansion Board 1 Reconnected
This causes all motor drivers on the expansion mainboard 6HC to stop responding.
Notes
At first I thought this could be other issues so I attempted the following:
- Swapped in multiple CAN cables -> the CAN connection is fine for hours on end as long as it's not printing
- Tried several Class 10 SD cards on the 2nd 6HC ranging from 4 GB to 64 GB with fresh formats
- Moved the GCODE file from a print file to a macro file to see if observed results changed
- Dropped Z microstepping to x8 and 800 steps/mm to see if it was a load issue.
No observed changes.
When not under a Print or Macro run status -> all drivers respond as expected and this error never occurs.
My setup is 4Y motors and 1X and 1U motors on the main 6HC with 2E and 3Z motors on the expansion 6HC.
E and Z are unresponsive whenever you initiate a gcode stream via a macro (an attempted hack workaround) or whenever the machine is in a printing status. This is similar to the behavior in 3.5.4, except that in 3.5.4 ANY extruder motor on the expansion mainboard will not respond AT ALL regardless of machine status.
-
@aetherialdesign can you grab an M122 directly from the 6HC that's acting as an expansion board after it's disconnected? That should give more information about what caused the software reset.
-
@jay_s_uk
Is that a different command than the M122 B1 (it is CAN address #1 for the expansion board)?
The M122 in comment #0 is an M122 B1 pulled right after the error presents itself.
-
@aetherialdesign You need to connect to the expansion 6HC directly using a USB terminal (or via the network interface if you have that set up) and then run M122 directly on the expansion board. That will return a lot more information about any possible shutdown.
-
@jay_s_uk
I pulled this via laptop connected to the 2nd 6HC over USB in Repetier. I had it connected during initial moves but it was causing the machine to move hyper stuttery so I disconnected and reconnected after the error occurred then pulled the M122.
17:11:26.785 : === Diagnostics ===
17:11:26.785 : RepRapFirmware for Duet 3 MB6HC version 3.6.0-beta.3 (2025-01-16 19:09:36) running on Duet 3 MB6HC v1.01 (expansion mode)
17:11:26.785 : Board ID: 08DJM-9P63L-DJMSS-6J1F6-3SN6T-KUHHB
17:11:26.785 : Used output buffers: 1 of 40 (2 max)
17:11:26.785 : === RTOS ===
17:11:26.785 : Static ram: 136892
17:11:26.785 : Dynamic ram: 124960 of which 0 recycled
17:11:26.785 : Never used RAM 76660, free system stack 184 words
17:11:26.785 : Tasks: NETWORK(1,ready,10.6%,545) HEAT(3,nWait 6,0.0%,355) Move(4,nWait 6,0.0%,333) TMC(4,nWait 6,2.9%,377) CanReceiv(6,nWait 1,0.1%,682) CanSender(5,nWait 7,0.0%,334) CanClock(7,invalid,0.0%,351) MAIN(1,running,85.9%,440) IDLE(0,ready,0.5%,29) USBD(3,blocked,0.0%,137), total 100.0%
17:11:26.785 : Owned mutexes: USB(MAIN)
17:11:26.785 : === Platform ===
17:11:26.785 : Last reset 00:00:19 ago, cause: software
17:11:26.785 : Last software reset at 2025-01-31 22:10, reason: HeatTaskStuck, Gcodes spinning, available RAM 84820, slot 2
17:11:26.785 : Software reset code 0x0143 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0040080f BFAR 0x00000000 SP 0x2041d500 Task Move Freestk 1044 ok
17:11:26.785 : Stack: 00000000 2041ce5c 10000000 e000e000 2041d610 0049c135 0049c92c 61030000 20421268 0000002c 2041d5f0 204329b8 a5a5a5a5 204329b0 004551d1 0045825f a5a5a5a5 20432a38 a5a5a5a5 a5a5a5a5 a5a5a5a5 3e9ffad4 41c17ae3 41c27bff 3dd366e1 41419774 41422094
17:11:26.785 : Error status: 0x00
17:11:26.785 : MCU temperature: min 40.3, current 42.8, max 42.9
17:11:26.785 : Supply voltage: min 24.2, current 24.2, max 24.3, under voltage events: 0, over voltage events: 0, power good: yes
17:11:26.785 : 12V rail voltage: min 12.1, current 12.1, max 12.2, under voltage events: 0
17:11:26.785 : Heap OK, handles allocated/used 0/0, heap memory allocated/used/recyclable 0/0/0, gc cycles 0
17:11:26.785 : Events: 0 queued, 0 completed
17:11:26.785 : Date/time: 2025-01-31 22:11:18
17:11:26.785 : Slowest loop: 1.04ms; fastest: 0.07ms
17:11:26.785 : USB interrupts 234
17:11:26.785 : === Storage ===
17:11:26.785 : Free file entries: 20
17:11:26.785 : SD card 0 detected, interface speed: 25.0MBytes/sec
17:11:26.785 : SD card longest read time 1.6ms, write time 0.0ms, max retries 0
17:11:26.785 : === Move ===
17:11:26.785 : Segments created 350, maxWait 0ms, bed comp in use: none, height map offset 0.000, hiccups added 0/0 (0.00/0.00ms), max steps late 0, ebfmin 0.00, ebfmax 0.00
17:11:26.785 : Pos req/act/dcf: 16732.80/0/0.00 0.00/0/0.00 0.00/0/0.00 0.00/0/0.00 0.00/0/0.00 0.00/0/0.00
17:11:26.785 : Peak sync jitter -1/1, peak Rx sync delay 175, resyncs 0/0, next step interrupt due in 262 ticks, disabled
17:11:26.785 : Driver 0: standstill, SG min n/a, mspos 840, reads 39232, writes 14 timeouts 0
17:11:26.785 : Driver 1: standstill, SG min n/a, mspos 8, reads 39235, writes 11 timeouts 0
17:11:26.785 : Driver 2: standstill, SG min n/a, mspos 296, reads 39235, writes 11 timeouts 0
17:11:26.785 : Driver 3: standstill, SG min n/a, mspos 296, reads 39235, writes 11 timeouts 0
17:11:26.785 : Driver 4: standstill, SG min n/a, mspos 296, reads 39235, writes 11 timeouts 0
17:11:26.785 : Driver 5: standstill, SG min n/a, mspos 8, reads 39235, writes 11 timeouts 0
17:11:26.785 : Phase step loop runtime (us): min=0, max=5, frequency (Hz): min=0, max=2212
17:11:26.785 : === DDARing 0 ===
17:11:26.785 : Scheduled moves 190, completed 190, LaErrors 0, Underruns [0, 0, 0]
17:11:26.785 : Segments left 0, axes/extruders owned 0x00000000, drives owned 0x00000000
17:11:26.785 : Code queue is empty
17:11:26.785 : === DDARing 1 ===
17:11:26.785 : Scheduled moves 0, completed 0, LaErrors 0, Underruns [0, 0, 0]
17:11:26.785 : Segments left 0, axes/extruders owned 0x00000000, drives owned 0x00000000
17:11:26.785 : Code queue is empty
17:11:26.785 : === Heat ===
17:11:26.785 : Bed heaters -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamber heaters -1 -1 -1 -1 -1 -1 -1 -1, ordering e=== GCodes ===
17:11:26.785 : Movement locks held by null, null
17:11:26.785 : HTTP is idle in state(s) 0
17:11:26.785 : Telnet is idle in state(s) 0
17:11:26.785 : File is idle in state(s) 0
17:11:26.785 : USB is ready with "M122" in state(s) 0
17:11:26.785 : Aux is idle in state(s) 0
17:11:26.785 : Trigger is idle in state(s) 0
17:11:26.785 : Queue is idle in state(s) 0
17:11:26.785 : LCD is idle in state(s) 0
17:11:26.785 : SBC is idle in state(s) 0
17:11:26.785 : Daemon is idle in state(s) 0
17:11:26.785 : Aux2 is idle in state(s) 0
17:11:26.785 : Autopause is idle in state(s) 0
17:11:26.785 : File2 is idle in state(s) 0
17:11:26.785 : Queue2 is idle in state(s) 0
17:11:26.785 : === CAN ===
17:11:26.785 : Messages queued 158, received 462, lost 0, ignored 19, errs 0, boc 0
17:11:26.785 : Longest wait 0ms for reply type 0, peak Tx sync delay 0, free buffers 50 (min 50), ts 1/0/0
17:11:26.785 : Tx timeouts 0,0,0,0,0,0
17:11:26.785 : === Network ===
17:11:26.785 : Slowest loop: 0.22ms; fastest: 0.00ms
17:11:26.785 : Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0) Telnet(0)
17:11:26.785 : HTTP sessions: 0 of 8
17:11:26.785 : = Ethernet =
17:11:26.785 : Interface state: disabled
17:11:26.785 : Error counts: 0 0 0 0 0 0
17:11:26.785 : Socket states: 0 0 0 0 0 0 0 0 0
17:11:26.785 : === Multicast handler ===
17:11:26.785 : Responder is inactive, messages received 0, responses 0
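The lines that matter most in a dump like this are the two software-reset lines in the Platform section. As a rough aid for comparing resets across several runs, the Python sketch below extracts the reset time, reason, spinning module, reset code, and the task that was running. The function name and the regular expressions are hypothetical and were written only against the wording of this one report; other firmware versions may phrase these lines differently.

```python
import re

# Patterns are based solely on the two Platform lines in the dump above.
RESET_RE = re.compile(
    r"Last software reset at (?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}), "
    r"reason: (?P<reason>[^,]+), (?P<module>\w+) spinning"
)
CODE_RE = re.compile(r"Software reset code (?P<code>0x[0-9a-fA-F]+)")
TASK_RE = re.compile(r"\bTask (?P<task>\w+) Freestk (?P<free>\d+)")


def summarize_reset(m122_text):
    """Collect whichever reset fields are present in a full M122 report."""
    summary = {}
    for pattern in (RESET_RE, CODE_RE, TASK_RE):
        match = pattern.search(m122_text)
        if match:
            summary.update(match.groupdict())
    return summary


# The two relevant lines, copied from the report above.
sample = (
    "17:11:26.785 : Last software reset at 2025-01-31 22:10, reason: HeatTaskStuck, "
    "Gcodes spinning, available RAM 84820, slot 2 "
    "17:11:26.785 : Software reset code 0x0143 HFSR 0x00000000 CFSR 0x00000000 "
    "ICSR 0x0040080f BFAR 0x00000000 SP 0x2041d500 Task Move Freestk 1044 ok"
)

if __name__ == "__main__":
    for key, value in summarize_reset(sample).items():
        print(f"{key}: {value}")
```

The search works on the pasted text whether or not the terminal kept the line breaks, so the Repetier log can be fed in as-is.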
-
I additionally reran T0 PID tuning after this with CPAP to ensure things are stable on the heaters -> same results even with stable temps. Main difference noticed is the error appears twice now since I left it to run for a couple minutes instead of stopping it.
-
@aetherialdesign HeatTaskStuck sounds unusual, maybe @dc42 can help out