Mid print hang 3.4.0b2, duet3, sbc
-
I'm getting random print cancellations with my setup. The console on the duet display reads:
Lost connection to Linux
cancelled printing file ...
Connection to Linux established!
All with the same timestamp and without running the cancel print gcode.
I need this machine running ASAP for production, so any advice on moving back to a stable firmware would be appreciated.
Config:
G4 S2
G90 ; send absolute coordinates...
M83 ; ...but relative extruder moves
M550 P"CMW3D-1" ; set printer name
M669 K1 ; select CoreXY mode
; Drives
M569 P0.0 S1 ; physical drive 0.0 goes forwards
M569 P0.1 S0 ; physical drive 0.1 goes forwards
M569 P0.2 S0 ; physical drive 0.2 goes forwards
M569 P0.3 S0 ; physical drive 0.3 goes forwards
M569 P0.4 S0
M569 P0.5 S1
M584 X0.0 Y0.1 Z0.2:0.3:0.4 E0.5 ; set drive mapping
M350 X16 Y16 Z16 I1
M350 E16 I0 ; configure microstepping with interpolation
M92 X160.00 Y160.00 Z1600.00 E710.26 ; set steps per mm
M566 X1500.00 Y1500.00 Z60.00 E150.00 ; set maximum instantaneous speed changes (mm/min)
M203 X25000.00 Y25000.00 Z900.00 E30000.00 ; set maximum speeds (mm/min)
M201 X9000.00 Y9000.00 Z20.00 E1000.00 ; set accelerations (mm/s^2)
M906 X1400 Y1400 Z1400 E800 I50 ; set motor currents (mA) and motor idle factor in per cent
M84 S30 ; Set idle timeout
; Axis Limits
M208 X0 Y0 Z0 S1 ; set axis minima
M208 X375 Y393 Z550 S0 ; set axis maxima
; Endstops
M574 X1 S1 P"io6.in" ; configure active-high endstop for low end on X via pin io6.in
M574 Y1 S1 P"io8.in" ; configure active-high endstop for low end on Y via pin io8.in
M574 Z1 S2 ; configure Z-probe endstop for low end on Z
; Z-Probe
M950 S0 C"io5.out" ; create servo pin 0 for BLTouch
M558 P9 C"io5.in" H15 F600 T6000 ; set Z probe type to bltouch and the dive height + speeds
G31 P500 X-7.5 Y-20 Z3.4 ; set Z probe trigger value, offset and trigger height
M557 X15:330 Y15:370 S18 ; define mesh grid
; Heaters
M308 S0 P"temp0" Y"thermistor" T100000 B4138 ; configure sensor 0 as thermistor on pin temp0
M950 H0 C"out1" T0 ; create bed heater output on out0 and map it to sensor 0
M307 H0 B1 S1.00 ; enable bang-bang mode for the bed heater and set PWM limit
M140 H0 ; map heated bed to heater 0
M143 H0 S120 ; set temperature limit for heater 0 to 120C
M308 S1 P"temp1" Y"thermistor" T500000 B4723 C1.196220e-7 H20 ; configure sensor 1 as thermistor on pin 121.temp0
M950 H1 C"out2" T1 ; create nozzle heater output on 121.out0 and map it to sensor 1
M307 H1 B0 S1.00 ; disable bang-bang mode for heater and set PWM limit
M143 H1 S350 ; set temperature limit for heater 1 to 280C
; Fans
M950 F0 C"out4" Q500 ; create fan 0 on pin out4 and set its frequency
M106 P0 S0.7 H1 T45 ; set fan 0 value. Thermostatic control is turned on
M950 F1 C"out5" Q500 ; create fan 4 on pin 122.out1 and set its frequency
M106 P1 S1 H1 T45 ; set fan 4 value. Thermostatic control is turned on
M950 F2 C"out6" Q500 ; create fan 0 on pin out4 and set its frequency
M106 P2 S0.7 H1 T45
M950 F3 C"out3" Q500 ; create fan 0 on pin out4 and set its frequency
; Tools
M563 P0 D0.5 H1 F3 ; define tool 0
G10 P0 X0 Y0 Z0 ; set tool 0 axis offsets
G10 P0 R0 S0 ; set initial tool 0 active and standby temperatures to 0C
; Custom settings are not defined
; Miscellaneous
M575 P1 S1 B57600 ; enable support for PanelDue
M501 ; load saved parameters from non-volatile memory
T0 ; select first tool
M671 X-50:163:440 Y0:473:0 S25 ; leadscrews at rear left, front middle and rear right
M575 P1 B57600 S1
M572 D0 S0.05
-
The SBC doesn't seem to be losing power or restarting when this happens, as I maintained a PuTTY connection through the most recent problem.
m122 immediately after stopping:
m122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.4.0beta2 (2021-08-03 12:42:33) running on Duet 3 MB6HC v1.01 or later (SBC mode)
Board ID: 08DJM-956BA-NA3TJ-6JTDD-3S06N-KU8LS
Used output buffers: 1 of 40 (26 max)
=== RTOS ===
Static ram: 151128
Dynamic ram: 62208 of which 216 recycled
Never used RAM 134000, free system stack 127 words
Tasks: SBC(ready,3.9%,310) HEAT(notifyWait,0.1%,326) Move(notifyWait,1.8%,262) CanReceiv(notifyWait,0.0%,943) CanSender(notifyWait,0.0%,361) CanClock(delaying,0.1%,334) TMC(notifyWait,58.3%,59) MAIN(running,35.7%,922) IDLE(ready,0.2%,29), total 100.0%
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 01:54:18 ago, cause: software
Last software reset at 2021-08-19 13:40, reason: User, Platform spinning, available RAM 134072, slot 0
Software reset code 0x0000 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0x00000000 SP 0x00000000 Task SBC Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0
Step timer max interval 136
MCU temperature: min 26.4, current 27.6, max 37.1
Supply voltage: min 23.9, current 24.0, max 24.1, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.0, current 12.1, max 12.1, under voltage events: 0
Heap OK, handles allocated/used 99/0, heap memory allocated/used/recyclable 2048/36/36, gc cycles 0
Driver 0: position 46018, standstill, reads 35862, writes 17 timeouts 0, SG min/max 0/927
Driver 1: position -5508, standstill, reads 35862, writes 17 timeouts 0, SG min/max 0/905
Driver 2: position 480, standstill, reads 35862, writes 17 timeouts 0, SG min/max 0/1016
Driver 3: position 0, standstill, reads 35862, writes 17 timeouts 0, SG min/max 0/188
Driver 4: position 0, standstill, reads 35863, writes 17 timeouts 0, SG min/max 0/1023
Driver 5: position 0, standstill, reads 35863, writes 17 timeouts 0, SG min/max 0/452
Date/time: 2021-08-19 15:34:34
Slowest loop: 66.88ms; fastest: 0.04ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 37.5MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, segments created 11, maxWait 22003ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 2729, completed moves 2729, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 5], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File* is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue* is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty
=== CAN ===
Messages queued 61729, received 0, lost 0, longest wait 0ms for reply type 0, peak Tx sync delay 0, free buffers 49 (min 49), ts 34291/0/0
Tx timeouts 0,22,34290,0,0,27414 last cancelled message type 30 dest 127
=== SBC interface ===
State: 4, failed transfers: 2, checksum errors: 398
Last transfer: 2ms ago
RX/TX seq numbers: 64933/1590
SPI underruns 415, overruns 20
Disconnects: 4, timeouts: 0, IAP RAM available 0x2c690
Buffer RX/TX: 0/0-0
=== Duet Control Server ===
Duet Control Server v3.4-b2
Code buffer space: 4096
Configured SPI speed: 8000000Hz
Full transfers per second: 39.16, max wait times: 54.4ms/9.9ms
Codes per second: 0.85
Maximum length of RX/TX data transfers: 2924/1044
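The "SBC interface" part of the report above is where the SPI link health shows up (398 checksum errors and 4 disconnects in under two hours). To compare these counters between runs, for example before and after a grounding change, a small parser can help. This is only a sketch: the function name `sbc_counters` is made up for illustration, and the patterns assume the field wording printed by this firmware version, which may differ in other releases.

```python
import re

# Patterns follow the wording in the M122 output of 3.4.0beta2 shown above.
# Other firmware versions may word these fields differently (assumption).
SBC_PATTERNS = {
    "failed_transfers": r"failed transfers: (\d+)",
    "checksum_errors": r"checksum errors: (\d+)",
    "spi_underruns": r"SPI underruns (\d+)",
    "spi_overruns": r"SPI underruns \d+, overruns (\d+)",
    "disconnects": r"Disconnects: (\d+)",
}


def sbc_counters(m122_text: str) -> dict:
    """Return the SPI health counters found in M122 text (missing ones are skipped)."""
    counters = {}
    for name, pattern in SBC_PATTERNS.items():
        match = re.search(pattern, m122_text)
        if match:
            counters[name] = int(match.group(1))
    return counters


if __name__ == "__main__":
    sample = (
        "=== SBC interface ===\n"
        "State: 4, failed transfers: 2, checksum errors: 398\n"
        "SPI underruns 415, overruns 20\n"
        "Disconnects: 4, timeouts: 0, IAP RAM available 0x2c690\n"
    )
    print(sbc_counters(sample))
```

Saving the M122 text after each print and comparing the counters gives a rough measure of whether a wiring change actually reduced the error rate.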
-
Here is the DWC output when the stall happens. The event at 1:13 was during a print; the event at 1:16 was after.
-
Can you set up some additional monitoring on the pi?
https://duet3d.dozuki.com/Wiki/Getting_Started_With_Duet_3#Section_Monitoring_optional
-
@phaedrux I started DCS with debugging enabled and immediately saw the CRC warnings below. I'm going to run a print and see what happens.
[warn] Bad data CRC32 (expected 0x7122d0a5, got 0x68c5ab97)
[warn] Bad data CRC32 (expected 0xc31600e2, got 0xe7708562)
[warn] Bad data CRC32 (expected 0xc8a7c26b, got 0x5b1340ab)
[warn] Bad data CRC32 (expected 0x312c52a0, got 0x2c1f8ef5)
[warn] Bad data CRC32 (expected 0xd1710152, got 0x1c81bcd5)
edit: couldn't get into the web interface. Got this:
System.OperationCanceledException: Board is not available (no header)
   at DuetControlServer.SPI.DataTransfer.ExchangeHeader() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 1436
   at DuetControlServer.SPI.DataTransfer.PerformFullTransfer(Boolean connecting) in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 194
[info] Connection to Duet established
[debug] Updated key spindles
[debug] Requesting update of key state, seq 0 -> 1
[debug] Updated key state
[debug] Requesting update of key tools, seq 0 -> 5
[debug] Updated key tools
[debug] Requesting update of key volumes, seq 0 -> 0
[debug] Updated key volumes
[debug] Requesting update of key boards, seq 0 -> 1087
[debug] Updated key boards
[debug] Requesting update of key directories, seq 0 -> 0
[debug] Updated key directories
[debug] Requesting update of key fans, seq 0 -> 7
[debug] Updated key fans
[debug] Requesting update of key global, seq 0 -> 0
[debug] Updated key global
[debug] Requesting update of key heat, seq 0 -> 10
[debug] Updated key heat
[debug] Requesting update of key inputs, seq 0 -> 25
[debug] Updated key inputs
[debug] Requesting update of key job, seq 0 -> 15
[warn] Bad data CRC32 (expected 0x3534c24d, got 0xc6f735d8)
[debug] Updated key job
[debug] Requesting update of key move, seq 0 -> 54
[warn] Bad data CRC32 (expected 0x390f64ca, got 0xffa85249)
[debug] Updated key move
[debug] Requesting update of key network, seq 0 -> 3
[debug] Updated key network
[debug] Requesting update of key sensors, seq 0 -> 8
[debug] Updated key sensors
[warn] Bad data CRC32 (expected 0x4c04f1ba, got 0x11d51962)
[warn] Bad data CRC32 (expected 0x4bd6bb59, got 0x5ca43599)
[warn] Bad data CRC32 (expected 0x4bd6bb59, got 0x01651ad2)
[warn] Bad data CRC32 (expected 0xe75e751f, got 0x20b97c2f)
[warn] Bad data CRC32 (expected 0x0259194f, got 0x1dc9c3ca)
[warn] Bad data CRC32 (expected 0x59691b91, got 0xd61c979f)
[fatal] Abnormal program termination
[fatal] SPI task faulted
System.Exception: RepRapFirmware refused message format
   at DuetControlServer.SPI.DataTransfer.ExchangeHeader() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 1540
   at DuetControlServer.SPI.DataTransfer.PerformFullTransfer(Boolean connecting) in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 194
   at DuetControlServer.SPI.Interface.Run() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/Interface.cs:line 1026
   at DuetControlServer.Utility.PriorityThreadRunner.<>c__DisplayClass0_0.<Start>b__0() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/Utility/PriorityThreadRunner.cs:line 25
[fatal] SPI task faulted
System.Exception: RepRapFirmware refused message format
   at DuetControlServer.SPI.DataTransfer.ExchangeHeader() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 1540
   at DuetControlServer.SPI.DataTransfer.PerformFullTransfer(Boolean connecting) in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/DataTransfer.cs:line 194
   at DuetControlServer.SPI.Interface.Run() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/SPI/Interface.cs:line 1026
   at DuetControlServer.Utility.PriorityThreadRunner.<>c__DisplayClass0_0.<Start>b__0() in /home/christian/Duet3D/DuetSoftwareFramework/src/DuetControlServer/Utility/PriorityThreadRunner.cs:line 25
[debug] Update task terminated
[debug] IPC task terminated
[debug] Job task terminated
[debug] Periodic updater task terminated
[info] Application has shut down
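To put a number on how noisy the link is, the CRC warnings in a saved copy of this debug output can be counted. The sketch below assumes the log lines look exactly like the ones pasted above; `summarize_dcs_log` is a hypothetical helper name, and how the log gets saved to text is left to the reader.

```python
import re

# Matches the warning format seen in the DCS debug output above.
CRC_WARNING = re.compile(
    r"\[warn\] Bad data CRC32 \(expected 0x([0-9a-f]{8}), got 0x([0-9a-f]{8})\)"
)


def summarize_dcs_log(text: str) -> dict:
    """Count CRC mismatch warnings and [fatal] markers in DCS log text."""
    mismatches = CRC_WARNING.findall(text)
    return {
        "crc_warnings": len(mismatches),
        "fatal_lines": text.count("[fatal]"),
    }


if __name__ == "__main__":
    sample = (
        "[warn] Bad data CRC32 (expected 0x7122d0a5, got 0x68c5ab97)\n"
        "[debug] Updated key job\n"
        "[warn] Bad data CRC32 (expected 0xc31600e2, got 0xe7708562)\n"
        "[fatal] Abnormal program termination\n"
    )
    print(summarize_dcs_log(sample))
```

Because the pattern is searched across the whole text, it works whether the log is one entry per line or pasted as a single run of text.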
-
Try a new SD card with a fresh duetpi image.
-
@phaedrux It seems turning on the berd air pump is causing the problem. Several times now I've switched on the berd air pump just to have the disconnect happen shortly after. I don't have a flyback diode on the pump. Originally I had it connected to out3, then I tried moving it to out9, which should have an onboard flyback diode, and now I have moved it onto an external MOSFET board. The machine still stops once the pump turns on.
-
@zakm0n It sounds a lot like the SPI communication is interrupted by external sources. You may want to try to move other cables away from the SBC cable, replace it with a shorter one (if possible), or reduce the SPI frequency in /opt/dsf/conf/config.json.
-
@chrishamm I tried lowering the SPI frequency to half its default. The SBC cable is run underneath the Duet, so it should be well protected. I'm still getting disconnects because of SPI interruptions.
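For anyone repeating this step, halving the SPI frequency amounts to editing one value in /opt/dsf/conf/config.json. The sketch below is illustrative only: it assumes the setting is stored under a key named `SpiFrequency` (check the file on your own install, as the key name is not confirmed in this thread), and `halve_spi_frequency` is a made-up helper that works on the JSON text rather than touching the file directly.

```python
import json


def halve_spi_frequency(config_text: str, key: str = "SpiFrequency") -> str:
    """Return the config JSON with the SPI frequency halved.

    The key name is an assumption; a KeyError means the file uses another name.
    """
    config = json.loads(config_text)
    config[key] = int(config[key]) // 2
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # 8000000 Hz matches the "Configured SPI speed" reported by M122 earlier.
    original = '{"SpiFrequency": 8000000}'
    print(halve_spi_frequency(original))
```

Keep a backup copy of the original file before writing any change back. DCS presumably needs a restart to pick up the new value, and the "Configured SPI speed" line in M122 is a way to confirm it took effect.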
-
@chrishamm So, I have the chassis of the printer bonded to the mains earth, and after running a grounding wire to the body of the berd air pump, I'm now able to get about 3 hours into a print, but I'm still getting disconnects somewhere around the 3.5-4 hour mark into a print. Any other grounding you might suggest to get rid of noise? The SBC cable is run underneath of the Duet 3 and there's not really any way to further isolate it from the wiring of the machine. I'm at a loss here, and I'm considering ditching the Duet3 for something else going forward, as my personal machines with Klipper have absolutely never had such an issue, even in much less ideal setups.
-
@zakm0n said in Mid print hang 3.4.0b2, duet3, sbc:
The SBC cable is run underneath of the Duet 3 and there's not really any way to further isolate it from the wiring of the machine.
Foil tape as shielding?
-
@phaedrux I could try that, but I'll have to re-route the cable from behind the board. Foil tape has a way of shorting out things, doesn't it? I just can't understand how I'm seemingly the only one to ever have this problem.
-
@zakm0n said in Mid print hang 3.4.0b2, duet3, sbc:
Foil tape has a way of shorting out things, doesn't it?
A layer of non-conductive tape on top? There are purpose made shielded ribbon cables as well.
@zakm0n said in Mid print hang 3.4.0b2, duet3, sbc:
I just can't understand how I'm seemingly the only one to ever have this problem.
There are maybe a few other cases of similar interference I can think of, but it's usually been with a Duex ribbon cable or PanelDue picking up some noise.
The fact you're getting an improvement by increasing the grounding is promising though.
-
@zakm0n I grab Vin and GND for the Pi's buck converter relatively close to the Duet's Vin power connector in order to minimise potential interference.
I've been printing A LOT in SBC mode (countless prints > 7h) and I have not observed any connection drops on my setup. Do you get CRC errors when you bend or slightly twist the ribbon cable? If yes, try replacing it - a bad cable is the most plausible reason for your connection drops.
-
@chrishamm The Pi has an external wall wart currently. The ribbon cable isn't in a place where it can be moved easily, so that hasn't been something I can test. Adding the grounding strap to the berd pump got me from 10 minutes to 4 hours though, so I'm thinking grounding and shielding are where I should be focusing.
-
@phaedrux So, I managed to get the ribbon cable wrapped in metal foil tape and successfully printed a 6-hour print yesterday. I tried starting a print today and only made it 10 minutes. It's like all of the gains I made by grounding and such just disappeared. It seems like this time there were a bunch of rapid connects and disconnects in the console on my PanelDue.
-
And just to verify: if you have the berd air completely off, do the prints still consistently complete without issue?