I've been running a rather long macro from the SBC and found that it cannot complete reliably. I get various symptoms and errors:
- The printer pauses or hesitates between moves for approx. 0.25 to 2 second
- Pauses are more frequent if the expression language is used, particularly loops
- Warning that the Macro has not completed in 3000ms (the macro takes 3+ minutes to run, but I don't always get this error, so its no the total macro runtime its complaining about)
Warning: Controller has been reset
- macro execution stops and I have to reset the machine
What Hardware and Software?
Duet 3 and Raspberry Pi 3B+ running 3.2 final
What debugging steps have been taken?
In order:
- Re-worked my macro to remove all scripting and expressions so its plain GCode (eliminates parsing weirdness on the SBC as root cause)
- Switch to Ethernet on the SBC, disabled WiFi
- Switched SBC power to an external power adapter, removed the 5v -> SBC jumper
- Checked that there is no SD card in the Duet
- Tried the following config changes in DCS suggested here: https://github.com/Duet3D/DuetSoftwareFramework/issues/137
- Raised the SPI frequency to 16Mhz
- Dropped the poll interval to 5ms
- Increased buffered macro codes to 32
- Moving the Pi board and ribbon cable as far away from any other wiring as is practical in case its interference
- Disable logging in the script by commenting out the
M929 S3 P"/sys/bed-sample.log"
line
Results
Increasing the SPI frequency and lowering the poll interval had a positive impact. This resulted in noticeably snappier macro execution. But the intermittent complete stalls remained and the Warning: Controller has been reset
still happened.
Turning off the logging command in the macro seems to have eliminated the stall problem. My wild guess in that DCS blocks to perform logging file I/O and this causes an SPI stall but I haven't looked at the DCS code to check. Since I need this on to capture the probing data for Python to process I cant accept turning this off as a solution (unless you know another way to get data out of the duet reliably).
Possible Next Steps
- Try a very fast SD card to see if the faster I/O speed helps
- Try raising the logging level from Info to Error and then log all data as 'Error' on the theory that this would have less overhead somehow
- Try a Pi4 on the theory that more CPU might help.
M122
I got one right before a reset SPI Shows some disconnects have occurred and there appears to be an underun logged on the Duet (if I'm reading that correctly).
1/19/2021, 1:06:45 PM Warning: Controller has been reset
1/19/2021, 1:06:33 PM M122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.2 running on Duet 3 MB6HC v0.6 or 1.0 (SBC mode)
Board ID: 08DJM-956L2-G43S4-6JTD4-3SS6K-TA7AH
Used output buffers: 1 of 40 (16 max)
=== RTOS ===
Static ram: 149788
Dynamic ram: 65312 of which 76 recycled
Never used RAM 143656, free system stack 122 words
Tasks: Linux(ready,87) HEAT(blocked,296) CanReceiv(blocked,834) CanSender(blocked,344) CanClock(blocked,352) TMC(blocked,19) MAIN(running,1120) IDLE(ready,19)
Owned mutexes: Telnet(MAIN)
=== Platform ===
Last reset 00:26:13 ago, cause: software
Last software reset at 2021-01-19 12:40, reason: User, none spinning, available RAM 143620, slot 0
Software reset code 0x0012 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0044a000 BFAR 0x00000000 SP 0x00000000 Task Linu Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0
Aux1 errors 0,0,0
MCU temperature: min 44.5, current 46.4, max 46.7
Supply voltage: min 24.2, current 24.3, max 24.3, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.2, current 12.2, max 12.3, under voltage events: 0
Driver 0: position 56774, standstill, reads 48573, writes 8 timeouts 0, SG min/max 0/1023
Driver 1: position 48723, standstill, reads 48573, writes 8 timeouts 0, SG min/max 0/1023
Driver 2: position 1600, standstill, reads 48573, writes 8 timeouts 0, SG min/max 0/1023
Driver 3: position 24800, standstill, reads 48578, writes 2 timeouts 0, SG min/max not available
Driver 4: position 0, standstill, reads 48581, writes 0 timeouts 0, SG min/max not available
Driver 5: position 0, standstill, reads 48581, writes 0 timeouts 0, SG min/max not available
Date/time: 2021-01-19 13:06:32
Slowest loop: 2882.45ms; fastest: 0.04ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 37.5MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, maxWait 446431ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 1506, completed moves 1506, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 3], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
=== GCodes ===
Segments left: 0
Movement lock held by HTTP
HTTP* is idle in state(s) 45 0, running macro
Telnet* is doing "M122" in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0 0, running macro
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== CAN ===
Messages queued 6765, send timeouts 0, received 17182, lost 0, longest wait 1ms for reply type 6013, free buffers 48
=== SBC interface ===
State: 4, failed transfers: 0
Last transfer: 1ms ago
RX/TX seq numbers: 8122/8122
SPI underruns 0, overruns 0
Number of disconnects: 2, IAP RAM available 0x2c884
Buffer RX/TX: 1416/112-4088
=== Duet Control Server ===
Duet Control Server v3.2.0
HTTP:
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
Buffered code: M118 L1 S"Bed probe location: X:236.872 Y:8.128"
Buffered code: G1 X236.872 Y20.128 F9000
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
Buffered code: M118 L1 S"Bed probe location: X:196.872 Y:8.128"
Buffered code: G1 X196.872 Y20.128 F9000
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
Buffered code: M118 L1 S"Bed probe location: X:170.0 Y:8.128"
Buffered code: G1 X170.0 Y20.128 F9000
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
Buffered code: M118 L1 S"Bed probe location: X:130.0 Y:8.128"
Buffered code: G1 X130.0 Y20.128 F9000
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
Buffered code: M118 L1 S"Bed probe location: X:103.128 Y:8.128"
Buffered code: G1 X103.128 Y20.128 F9000
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
Buffered code: G30 S-1
Buffered code: G1 Z1 F252
==> 1420 bytes
Executing macro 0:/macros/mesh-bed-leveling/mesh-bed-probe.g, started by M98 P"0:/macros/mesh-bed-leveling/mesh-bed-probe.g"
> Next stack level
Code buffer space: 2696
Configured SPI speed: 16000000 Hz
Full transfers per second: 78.97
Maximum length of RX/TX data transfers: 3669/1548
ok
1/19/2021, 1:05:45 PM Connected to duet3