Solved Duet 3 Mini 5+ VIN under-voltage issues in SBC mode
-
+1 over here. Duet 3 Mini 5+ in SBC mode; SBC is powered by a step-down which is in turn powered by the same 48V supply that powers the Duet, so it's not possible for the Duet to lose power without the SBC also losing power. Occurrence seems completely random - I haven't observed a single common variable between the occurrences (apart from the fact that they all occur on the same printer, of course).
I've been contending with the issue by figuring out approximately where the print stopped, then manually editing the gcode file to pick up where it left off.
I'm loading the April 7th firmware now, and will post diagnostics if the issue occurs again.
-
@dc42 Thank you! I loaded your 7APR firmware at 10:33 AM local time today and left the board sitting idle while I worked. It's now 18:50 with zero VIN under-voltage events, whereas I had at least one during this time-frame each of the last two days (with the machine sitting idle). I'm going to let the machine sit for another 24 hours before I try printing, but, signs are positive!
-
Morning. After loading the firmware last night, I was greeted by this this morning. Machine was idle.
I E-stopped the board, and that has stopped the VSSA faults.
Have some tuning to do this morning, so will report back.
Regards,
Paul.
-
@paulhew, please run M115 to confirm the firmware build date/time.
-
m115 FIRMWARE_NAME: RepRapFirmware for Duet 3 Mini 5+ FIRMWARE_VERSION: 3.3beta2+1 ELECTRONICS: Duet 3 Mini5plus Ethernet FIRMWARE_DATE: 2021-04-07 13:33:17
Regards,
Paul
-
@paulhew Just had these pop up
They are new!
edit: and it did all of the movements and heating but nothing on the bed.
No blockage. Going back to 3.2.2.
edit.edit: was I supposed to upgrade the 1LC firmware also?
Downloaded 3.2.2, upload to board, now I get this.
M997 S0 Error: M997: Failed to find IAP file /opt/dsf/sd/firmware/Duet3_SBCiap32_Mini5plus.bin
P.
-
@paulhew said in Duet 3 Mini 5+ VIN under-voltage issues in SBC mode:
edit.edit: was I supposed to upgrade the 1LC firmware also?
Yes, if you run 3.3 on the main board then you need 3.3 on expansion and tool boards too.
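A quick way to check this is to compare the M115 reply from the main board with the M115 B121 reply from the tool board. The sketch below is only an illustration: the two sample replies are copied from this thread, the function names are made up, and treating "versions must match" as an exact string comparison is a simplifying assumption rather than an official rule.

```python
import re

# Main-board replies look like "... FIRMWARE_VERSION: 3.3beta2+1 ELECTRONICS: ..."
MAIN_RE = re.compile(r"FIRMWARE_VERSION:\s*(\S+)")
# Tool/expansion-board replies look like "Duet TOOL1LC firmware version 3.3beta2+1 (...)"
BOARD_RE = re.compile(r"firmware version\s+(\S+)")


def firmware_version(m115_reply: str) -> str:
    """Extract the version string from either style of M115 reply."""
    for pattern in (MAIN_RE, BOARD_RE):
        match = pattern.search(m115_reply)
        if match:
            return match.group(1)
    raise ValueError("no firmware version found in reply")


def versions_match(main_reply: str, *board_replies: str) -> bool:
    """True if every CAN-connected board reports the main board's version.

    Exact string equality is an assumption made for this sketch.
    """
    main = firmware_version(main_reply)
    return all(firmware_version(reply) == main for reply in board_replies)


# Sample replies as posted in this thread.
MAIN = ("FIRMWARE_NAME: RepRapFirmware for Duet 3 Mini 5+ "
        "FIRMWARE_VERSION: 3.3beta2+1 ELECTRONICS: Duet 3 Mini5plus Ethernet "
        "FIRMWARE_DATE: 2021-04-07 13:33:17")
TOOL = "Duet TOOL1LC firmware version 3.3beta2+1 (2021-04-07 10:51:19)"

print(firmware_version(MAIN), firmware_version(TOOL), versions_match(MAIN, TOOL))
```

If the tool board still reported a 3.2.x version while the main board was on 3.3beta, `versions_match` would return False, which is the situation described above.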
-
@paulhew, thanks for your reports.
I believe I have identified an issue with the DMA controller in the microcontroller, which means that although we are able to detect the onset of the VSSA fault and associated high VIN reading, our attempts to recover from it fail. I have raised a support case with Microchip in the hope that they can provide a workaround, and I await a response.
-
@paulhew We've been hunting this bug down for the last couple of weeks. The VSSA errors are part of the issue; you'll probably see that the readings for VIN, Z probe and thermistors are all wildly out, too. What's happening is that the DMA channels that hold the values for temperatures, VIN, VSSA, VREF and probe get moved around, so the value for temp0 reads the value of VIN, temp1 reads the probe, temp2 reads temp0, VREF reads temp1, VSSA reads temp2, VIN reads VREF, and probe reads VSSA. Naturally, the numbers are incorrect, and trigger the errors.
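To make the shuffle easier to picture, here is a small sketch of the mapping listed above. The mapping itself is taken from this post; the buffer ordering, the function names and the example values are assumptions made purely for illustration and do not come from the firmware source.

```python
# Mapping as described in the post: reading name -> channel whose value
# it actually receives once the DMA results have moved around.
OBSERVED_SOURCE = {
    "temp0": "VIN",
    "temp1": "probe",
    "temp2": "temp0",
    "VREF": "temp1",
    "VSSA": "temp2",
    "VIN": "VREF",
    "probe": "VSSA",
}

# ASSUMPTION: the thread does not give the real buffer order; this order is
# chosen only to show that the mapping behaves like a cyclic shift.
ASSUMED_ORDER = ["temp0", "temp1", "temp2", "VREF", "VSSA", "VIN", "probe"]


def misreported(true_values):
    """Return what each reading would report, given the true channel values."""
    return {name: true_values[source] for name, source in OBSERVED_SOURCE.items()}


def is_rotation_by(slots):
    """Check whether the mapping equals a cyclic shift of ASSUMED_ORDER."""
    n = len(ASSUMED_ORDER)
    return all(
        OBSERVED_SOURCE[ASSUMED_ORDER[i]] == ASSUMED_ORDER[(i - slots) % n]
        for i in range(n)
    )


# Made-up raw values, not measurements from any board.
example = {"temp0": 1100, "temp1": 1200, "temp2": 1300,
           "VREF": 4000, "VSSA": 5, "VIN": 2500, "probe": 60}

shifted = misreported(example)
print(shifted)
print("cyclic shift by two slots:", is_rotation_by(2))
```

Under the assumed ordering, the mapping in the post is exactly a shift by two slots, so every reading is wrong at once: VIN picks up the VREF value, VSSA picks up a thermistor value, and so on. That would fit VIN and VSSA faults showing up together with nonsense temperatures.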
Unfortunately, this looks to be an error in the microcontroller itself, which is why @dc42 is contacting Microchip for help. For now, we don't need any more error reports.
As far as we are aware, this problem only affects the Duet in SBC mode; if you run in standalone, it shouldn't be an issue.
If anyone is running in SBC mode and NOT getting these issues, it would be useful to know your setup: firmware version, Duet 3 Mini 5+ board version, SBC version. I get these errors with a very basic test bench setup, and when the board is idle: Duet 3 Mini 5+ v0.5 on firmware 3.3beta (all of them), 12V PSU, Raspberry Pi 4B (official RPi 5V PSU), DWC 3.2.2, 5x motors connected, resistors in place of thermistors (just to give a constant reading).
Ian
-
@droftarts Ian, I am aware DC has been working hard on resolving this issue. I try to be as informative as possible, hence the screenshots etc.
I think I have found a new error with the latest firmware.
I have M557 defined in my config.g.
Using the latest FW, if I use Dashboard - Compensation & Calibration - Define area, it is blank.
If I set it, it loses the info if I check again.
My build is: Meanwell 24V 300W PSU, Duet 3 Mini 5+, 1LC toolboard, BLT, 2 fans, 50W heater, Slice thermistor on 1LC, 1 small strip of LEDs running out of OUT1 on Mini, PanelDue.
I need to get back to 3.2.2 but am struggling to find info on how to go back.
-
@paulhew I was just explaining where the problem is, as we understand it now. As such, the screenshots and even M122 reports don't tell us anything extra; they show the result, but not the cause! We've been using M122 P1007 [memory address] to peek at values in memory to work out what's happening, which is not very user-friendly; it crashes the Duet if you get it wrong, then you have to wait for the error to happen again!
The DWC issue you are seeing is because you're using DWC 3.2.2 with RRF 3.3beta. You need the new DWC that recognises the new OM values in 3.3beta. At the moment, an updated DWC has not yet been released for SBC mode. There is a DWC 3.3beta for standalone, though.
I'll reply on your other thread about downgrading.
Ian
-
Edit: After about 36 hours of uptime, I got the VSSA fault error too.
I have yet to experience issues on my idling machine. I'll probably try a print tonight and see how it goes.
When I initially installed the upgrade, my extruder thermistor (which is on toolboard TEMP0) was reporting something absurd like 2000C, but turning my 24v PSU off and then on again fixed that.
It's been sitting idle with no errors for the last 22 hours.
Setup: 120v wall power, 24V PSU, Raspberry Pi 3B+ with the Canakit PSU, Duet Web Control 3.3.0-b2, 5x motors, live thermistors for bed/hotend. Bed is a 12v bed connected via a MOSFET board, and it's connected to one of the GPIOs rather than the big high-current bed connector. M115 as follows from up-thread:
4/8/2021, 8:21:47 AM m115 B121
Duet TOOL1LC firmware version 3.3beta2+1 (2021-04-07 10:51:19)
4/8/2021, 8:21:41 AM M115
FIRMWARE_NAME: RepRapFirmware for Duet 3 Mini 5+ FIRMWARE_VERSION: 3.3beta2+1 ELECTRONICS: Duet 3 Mini5plus Ethernet FIRMWARE_DATE: 2021-04-07 13:33:17
-
If any of you wishes to collect data in order to assist me in resolving this, please install the firmware at https://www.dropbox.com/sh/mj25l7gppbui5zl/AABLPxvI8HLr1gCzqLVlKCuea?dl=0 and either do an air print or leave the machine idle. If/when it fails:
- Report the symptoms;
- Run M122 and post the report here.
This firmware includes additional debugging info in the M122 report.
-
In the meantime, it would be fantastic if it could pause the print instead of marking it as completed after one of these events, so that it's easier to resume. Or at the very least, it would be useful if it didn't cause the axes to lose their "homed" status.
-
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.3beta2+1 (2021-04-09 14:40:21) running on Duet 3 Mini5plus Ethernet (SBC mode)
Board ID: FDQG2-Q296U-D65J0-40KMG-1K03Z-HXFTZ
Used output buffers: 1 of 40 (11 max)
=== RTOS ===
Static ram: 100072
Dynamic ram: 91712 of which 64 recycled
Never used RAM 51856, free system stack 172 words
Tasks: Linux(ready,115) HEAT(delaying,322) CanReceiv(notifyWait,774) CanSender(notifyWait,371) CanClock(delaying,340) TMC(notifyWait,99) MAIN(running,484) IDLE(ready,20) AIN(notifyWait,260)
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 04:37:40 ago, cause: software
Last software reset at 2021-04-09 21:48, reason: User, none spinning, available RAM 51804, slot 0
Software reset code 0x0012 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00430000 BFAR 0xe000ed38 SP 0x00000000 Task Linu Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0
Aux1 errors 0,0,0
MCU revision 3, ADC conversions started 19812079, completed 19812078, timed out 0, errs 1 tnd=0 rnd=0 wrs=0 wrx=1 war=1
Supply voltage: min 0.0, current 24.3, max 24.4, under voltage events: 1, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 0/0, heap memory allocated/used/recyclable 0/0/0, gc cycles 0
Driver 0: position 0, standstill, SG min/max 0/0, read errors 0, write errors 1, ifcnt 68, reads 20759, writes 20, timeouts 0, DMA errors 0
Driver 1: position 0, standstill, SG min/max 0/0, read errors 0, write errors 1, ifcnt 68, reads 20758, writes 20, timeouts 0, DMA errors 0
Driver 2: position 0, standstill, SG min/max 0/0, read errors 0, write errors 1, ifcnt 58, reads 20760, writes 18, timeouts 0, DMA errors 0
Driver 3: position 0, standstill, SG min/max 0/0, read errors 0, write errors 1, ifcnt 68, reads 20757, writes 20, timeouts 0, DMA errors 0
Driver 4: position 0, standstill, SG min/max 0/0, read errors 0, write errors 1, ifcnt 68, reads 20759, writes 20, timeouts 0, DMA errors 0
Driver 5: position 0, assumed not present
Driver 6: position 0, assumed not present
Date/time: 2021-04-10 02:26:25
Cache data hit count 4294967295
Slowest loop: 2.05ms; fastest: 0.12ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 0.0MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 83, maxWait 0ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1, chamberHeaters = -1 -1
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== CAN ===
Messages queued 149946, send timeouts 0, received 199896, lost 0, longest wait 2ms for reply type 6049, peak Tx sync delay 345, free buffers 17 (min 16)
=== SBC interface ===
State: 4, failed transfers: 0
Last transfer: 4ms ago
RX/TX seq numbers: 6547/6547
SPI underruns 0, overruns 0
Number of disconnects: 0, IAP RAM available 0x118dc
Buffer RX/TX: 0/0-0
=== Duet Control Server ===
Duet Control Server v3.3-b2
Code buffer space: 4096
Configured SPI speed: 8000000 Hz
Full transfers per second: 35.80
Codes per second: 0.00
Maximum length of RX/TX data transfers: 3148/796
-
@fletcher, thanks for that. From the report, I presume the fault was a spurious VIN undervoltage report, not a VSSA error.
Now I have a better idea of what causes the spurious undervoltage reports. I still need an M122 report from this firmware following a VSSA fault report.
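For anyone comparing their own reports, the two lines of interest in M122 are the "ADC conversions" line and the "Supply voltage" line. The sketch below pulls the numbers out of those lines; the sample text is copied from the report above, while the function name and the "suspect" flag are hypothetical. The flag is a rough reading aid invented for this example, not the logic the firmware uses.

```python
import re

SUPPLY_RE = re.compile(
    r"Supply voltage: min (?P<min>[\d.]+), current (?P<current>[\d.]+), "
    r"max (?P<max>[\d.]+), under voltage events: (?P<under>\d+), "
    r"over voltage events: (?P<over>\d+), power good: (?P<good>yes|no)"
)
ADC_RE = re.compile(
    r"ADC conversions started (?P<started>\d+), completed (?P<completed>\d+), "
    r"timed out (?P<timed_out>\d+), errs (?P<errs>\d+)"
)


def summarize(report: str) -> dict:
    """Extract supply-voltage and ADC counters from M122 text."""
    supply = SUPPLY_RE.search(report)
    adc = ADC_RE.search(report)
    if supply is None or adc is None:
        raise ValueError("report lacks the Supply voltage or ADC conversions line")
    summary = {
        "vin_min": float(supply["min"]),
        "vin_current": float(supply["current"]),
        "vin_max": float(supply["max"]),
        "under_voltage_events": int(supply["under"]),
        "over_voltage_events": int(supply["over"]),
        "power_good": supply["good"] == "yes",
        "adc_errors": int(adc["errs"]),
    }
    # HYPOTHETICAL heuristic: under-voltage events logged alongside ADC errors
    # hint that the reading, rather than the supply, may be at fault.
    summary["suspect_adc_glitch"] = (
        summary["under_voltage_events"] > 0 and summary["adc_errors"] > 0
    )
    return summary


# Two lines copied from the M122 report posted above.
SAMPLE = (
    "MCU revision 3, ADC conversions started 19812079, completed 19812078, "
    "timed out 0, errs 1 tnd=0 rnd=0 wrs=0 wrx=1 war=1\n"
    "Supply voltage: min 0.0, current 24.3, max 24.4, under voltage events: 1, "
    "over voltage events: 0, power good: yes\n"
)

print(summarize(SAMPLE))
```

In the sample, the minimum VIN is 0.0 while the current reading is 24.3 and one ADC error has been counted, which is the sort of pattern being asked about here.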
-
@dc42 My machine is happy to oblige!
4/10/2021, 8:21:11 AM M122
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.3beta2+1 (2021-04-09 14:40:21) running on Duet 3 Mini5plus Ethernet (SBC mode)
Board ID: FDQG2-Q296U-D65J0-40KMG-1K03Z-HXFTZ
Used output buffers: 1 of 40 (13 max)
=== RTOS ===
Static ram: 100072
Dynamic ram: 91712 of which 64 recycled
Never used RAM 51856, free system stack 172 words
Tasks: Linux(ready,115) HEAT(delaying,322) CanReceiv(notifyWait,774) CanSender(notifyWait,371) CanClock(delaying,340) TMC(notifyWait,99) MAIN(running,374) IDLE(ready,20) AIN(notifyWait,260)
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 18:32:25 ago, cause: software
Last software reset at 2021-04-09 21:48, reason: User, none spinning, available RAM 51804, slot 0
Software reset code 0x0012 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00430000 BFAR 0xe000ed38 SP 0x00000000 Task Linu Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0
Aux1 errors 0,0,0
MCU revision 3, ADC conversions started 79666166, completed 79666165, timed out 0, errs 4 tnd=0 rnd=0 wrs=0 wrx=4 war=4
Supply voltage: min 0.0, current 0.1, max 24.4, under voltage events: 2, over voltage events: 0, power good: no
Heap OK, handles allocated/used 0/0, heap memory allocated/used/recyclable 0/0/0, gc cycles 0
Driver 0: position 0, standstill, SG min/max 0/0, read errors 0, write errors 0, ifcnt 68, reads 52912, writes 0, timeouts 0, DMA errors 0
Driver 1: position 0, standstill, SG min/max 0/0, read errors 0, write errors 0, ifcnt 68, reads 52912, writes 0, timeouts 0, DMA errors 0
Driver 2: position 0, standstill, SG min/max 0/0, read errors 0, write errors 0, ifcnt 58, reads 52911, writes 0, timeouts 0, DMA errors 0
Driver 3: position 0, standstill, SG min/max 0/0, read errors 0, write errors 0, ifcnt 68, reads 52911, writes 0, timeouts 0, DMA errors 0
Driver 4: position 0, standstill, SG min/max 0/0, read errors 0, write errors 0, ifcnt 68, reads 52912, writes 0, timeouts 0, DMA errors 0
Driver 5: position 0, assumed not present
Driver 6: position 0, assumed not present
Date/time: 2021-04-10 16:21:10
Cache data hit count 4294967295
Slowest loop: 2.50ms; fastest: 0.09ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 0.0MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 83, maxWait 0ms, bed compensation in use: none, comp offset 0.000
=== MainDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1, chamberHeaters = -1 -1
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== CAN ===
Messages queued 450532, send timeouts 0, received 600717, lost 0, longest wait 0ms for reply type 0, peak Tx sync delay 345, free buffers 17 (min 17)
=== SBC interface ===
State: 4, failed transfers: 0
Last transfer: 3ms ago
RX/TX seq numbers: 30955/30955
SPI underruns 0, overruns 0
Number of disconnects: 0, IAP RAM available 0x118dc
Buffer RX/TX: 0/0-0
=== Duet Control Server ===
Duet Control Server v3.3-b2
Code buffer space: 4096
Configured SPI speed: 8000000 Hz
Full transfers per second: 35.82
Codes per second: 0.00
Maximum length of RX/TX data transfers: 3148/796
4/10/2021, 8:21:05 AM Error: VSSA fault, check thermistor wiring
Omitted: several hundred similar VSSA fault messages before this one. It was still repeating the error when I checked just now, and I had to M999 reset to get the web interface responsive enough to copy-paste!
-
10/04/2021, 18:06:12 Error: VSSA fault, check thermistor wiring
10/04/2021, 18:06:08 Error: VSSA fault, check thermistor wiring
10/04/2021, 18:06:03 Error: VSSA fault, check thermistor wiring
10/04/2021, 18:05:59 Error: VSSA fault, check thermistor wiring
10/04/2021, 18:05:56 m122
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.3beta2+1 (2021-04-09 14:40:21) running on Duet 3 Mini5plus Ethernet (SBC mode)
Board ID: A45XG-F396U-D65J0-40KML-1F03Z-H03XP
Used output buffers: 5 of 40 (36 max)
=== RTOS ===
Static ram: 100072
Dynamic ram: 92364 of which 0 recycled
Never used RAM 48412, free system stack 112 words
Tasks: Linux(resourceWait,119) HEAT(delaying,198) CanReceiv(notifyWait,774) CanSender(notifyWait,363) CanClock(delaying,340) TMC(notifyWait,99) MAIN(running,312) IDLE(ready,20) AIN(notifyWait,259)
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 07:09:53 ago, cause: software
Last software reset at 2021-04-10 00:26, reason: User, none spinning, available RAM 48444, slot 2
Software reset code 0x0012 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00000000 BFAR 0xe000ed38 SP 0x00000000 Task Linu Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0
Aux1 errors 0,0,0
MCU revision 3, ADC conversions started 30627993, completed 30627992, timed out 0, errs 5 tnd=0 rnd=0 wrs=0 wrx=5 war=5
Supply voltage: min 0.0, current 35.9, max 35.9, under voltage events: 1, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/0, heap memory allocated/used/recyclable 2048/1130/1130, gc cycles 0
Driver 0: position 64902, standstill, SG min/max 0/510, read errors 0, write errors 1, ifcnt 62, reads 265, writes 41, timeouts 6924, DMA errors 0, failedOp 0x6a
Driver 1: position -15887, standstill, SG min/max 0/510, read errors 0, write errors 1, ifcnt 62, reads 6828, writes 41, timeouts 361, DMA errors 0, failedOp 0x6f
Driver 2: position 328, standstill, SG min/max 0/242, read errors 0, write errors 1, ifcnt 76, reads 5597, writes 47, timeouts 1586, DMA errors 0, failedOp 0x72
Driver 3: position 0, standstill, SG min/max 0/270, read errors 0, write errors 1, ifcnt 76, reads 6731, writes 47, timeouts 450, DMA errors 0, failedOp 0x41
Driver 4: position 0, standstill, SG min/max 0/274, read errors 0, write errors 1, ifcnt 76, reads 7183, writes 47, timeouts 0, DMA errors 0
Driver 5: position 0, assumed not present
Driver 6: position 0, assumed not present
Date/time: 2021-04-10 18:05:56
Cache data hit count 4294967295
Slowest loop: 1000.25ms; fastest: 0.08ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 0.0MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 83, maxWait 14959696ms, bed compensation in use: mesh, comp offset 0.000
=== MainDDARing ===
Scheduled moves 1805, completed moves 1805, hiccups 0, stepErrors 0, LaErrors 0, Underruns [6, 0, 8], CDDA state -1
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1, chamberHeaters = -1 -1
=== GCodes ===
Segments left: 0
Movement lock held by null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File* is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue* is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== CAN ===
Messages queued 248388, send timeouts 0, received 309852, lost 0, longest wait 2ms for reply type 6049, peak Tx sync delay 380, free buffers 17 (min 16)
=== SBC interface ===
State: 4, failed transfers: 0
Last transfer: 4ms ago
RX/TX seq numbers: 9831/9831
SPI underruns 0, overruns 0
Number of disconnects: 0, IAP RAM available 0x118a8
Buffer RX/TX: 0/0-0
=== Duet Control Server ===
Duet Control Server v3.2.2
Code buffer space: 4096
Configured SPI speed: 8000000 Hz
Full transfers per second: 35.96
Maximum length of RX/TX data transfers: 3628/1248
10/04/2021, 18:05:54 Error: VSSA fault, check thermistor wiring
10/04/2021, 18:05:50 Error: VSSA fault, check thermistor wiring
Nothing important.....
Hope this helps @dc42