Reboots/crashes - RRF ≥3.5.0-rc1
-
New thread for this issue to make it easier to read up on; this one will "only" cover RRF 3.5.0-rc1 onward.
This is a continuation of an issue I've had since RRF 3.5.0-b3+, but that thread is over 100 replies long and spans several FW versions at this point, so it's quite difficult to read up on.
Hence the creation of this thread.
A short summary of what was done before RRF 3.5.0-rc1 to find the cause of the crashing and to try to mitigate it:
-
"Special" RRF version with extra checks implemented by dc42: didn't find/solve the issue.
-
Disabled everything being run in daemon.g to see if it was causing the issue; it turned out that wasn't the problem.
-
Disabled the FTP protocol to see if that would do anything, it didn't.
-
Tried reducing the DWC polling interval & the loop delay in daemon.g to increase board load; it didn't do anything.
-
Hypothesis that the issues were caused by the lower "quality" processor in the "Fysetc Big Dipper" board. So we tried a "hail mary" and reduced the SPI clock speed of transfers between the WiFi module and the main processor, which didn't help.
-
RRF 3.5.0-b4 was released, so I updated to that, but it didn't help.
-
The SD card died, so I bought a new one and it seemed to help for a LITTLE while, but that was also a false positive.
-
Mid-August 2023 I bought a genuine Duet 3 Mini 5+ (second hand) to see if that would solve the issues. But that board lasted just 24 hours before all the drivers went MIA and I had to do a warranty return on it.
-
August went by, and in early September the warranty replacement board arrived, so I got on it again, this time with RRF 3.5.0-rc1, and all seemed to be working (I didn't have time to fiddle with the printer all through September, so it ended up on the back burner).
-
Hypothesis that ESD buildup in the extruder is what's causing the crashes, which most likely isn't the case since it's happening when the printer is idle (and has been idle for a prolonged time).
-
Figured it's time to create this thread.
System info as of 29.10.2023:
Duet Mini 5+ v1.02, Duet 1LC v1.2 running in standalone mode.
- New crash while idle, 29th October 2023:
I noticed the printer crashing (while idle) when I was on the PC configuring the new Orca Slicer build (trying to see if it plays well with RRF).
Up until that point the printer hadn't crashed in 3-something (?) days, and all that time I had made sure to ONLY open one instance of DWC at a time, and ONLY when I was checking in on the printer, uploading something etc., otherwise not having an open instance of DWC anywhere on any device.
I've done this to try to rule out whether the crashing happens due to clients connecting/disconnecting to DWC, or some sort of WiFi/networking bug somewhere.
So I pulled the debug log and saw that there had been A LOT of disconnects/connects from when I turned on the PC and opened Orca Slicer (it has a built-in web viewer) up until the crash:
debug log:
2023-10-29 10:48:12 [warn] HTTP client 192.168.10.x login succeeded (session key 1575580951)
2023-10-29 10:48:12 [warn] HTTP client 192.168.10.x login succeeded (session key 2217735406)
2023-10-29 10:51:46 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:51:49 [warn] HTTP client 192.168.10.x login succeeded (session key 118969356)
2023-10-29 10:52:24 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:52:24 [warn] HTTP client 192.168.10.x login succeeded (session key 1733578164)
2023-10-29 10:52:27 [warn] HTTP client 192.168.10.x login succeeded (session key 4202995206)
2023-10-29 10:52:43 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:52:46 [warn] HTTP client 192.168.10.x login succeeded (session key 2348116754)
2023-10-29 10:52:54 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:52:54 [warn] HTTP client 192.168.10.x login succeeded (session key 3315820619)
2023-10-29 10:52:59 [warn] HTTP client 192.168.10.x login succeeded (session key 1983721012)
2023-10-29 10:54:52 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:54:55 [warn] HTTP client 192.168.10.x login succeeded (session key 3546078201)
2023-10-29 10:55:01 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:55:04 [warn] HTTP client 192.168.10.x login succeeded (session key 3923159273)
2023-10-29 10:55:09 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:55:13 [warn] HTTP client 192.168.10.x login succeeded (session key 680819864)
2023-10-29 10:55:16 [warn] HTTP client 192.168.10.x login succeeded (session key 78281989)
2023-10-29 10:55:17 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:55:20 [warn] HTTP client 192.168.10.x login succeeded (session key 2376704510)
2023-10-29 10:55:35 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:55:38 [warn] HTTP client 192.168.10.x login succeeded (session key 2319723920)
2023-10-29 10:58:32 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:58:35 [warn] HTTP client 192.168.10.x login succeeded (session key 1512936919)
2023-10-29 10:59:04 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:59:07 [warn] HTTP client 192.168.10.x login succeeded (session key 2952561223)
2023-10-29 10:59:14 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 10:59:17 [warn] HTTP client 192.168.10.x login succeeded (session key 1500509856)
2023-10-29 11:00:29 [warn] HTTP client 192.168.10.x login succeeded (session key 2689501227)
2023-10-29 11:00:42 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 11:00:47 [warn] HTTP client 192.168.10.x login succeeded (session key 1957555567)
2023-10-29 11:00:57 [warn] HTTP client 192.168.10.x login succeeded (session key 1871523710)
2023-10-29 11:01:01 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 11:01:01 [warn] HTTP client 192.168.10.x login succeeded (session key 3629313409)
2023-10-29 11:01:04 [warn] HTTP client 192.168.10.x login succeeded (session key 1771555041)
2023-10-29 11:01:05 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 11:01:08 [warn] HTTP client 192.168.10.x login succeeded (session key 2508801350)
2023-10-29 11:01:12 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 11:01:15 [warn] HTTP client 192.168.10.x login succeeded (session key 94628185)
2023-10-29 11:01:15 [warn] HTTP client 192.168.10.x disconnected
2023-10-29 11:01:23 [warn] HTTP client 192.168.10.x login succeeded (session key 1759425591)
power up + 00:00:03 [info] Event logging started at level debug
power up + 00:00:03 [info] Running: Duet 3 Mini5plus WiFi: 3.5.0-rc.1 (2023-08-31 16:16:56)
power up + 00:00:03 [info] Event logging stopped
power up + 00:00:03 [info] Event logging started at level debug
power up + 00:00:03 [info] Running: Duet 3 Mini5plus WiFi: 3.5.0-rc.1 (2023-08-31 16:16:56)
power up + 00:00:03 [debug] Done!
power up + 00:00:03 [debug] RepRapFirmware for Duet 3 Mini 5+ is up and running.
power up + 00:00:04 [warn] WiFi module started
power up + 00:00:06 [warn] Error: WiFi module reported: Failed to load credentials
power up + 00:00:06 [warn] WiFi module is idle
power up + 00:00:08 [warn] Error: WiFi module reported: Failed to load credentials
power up + 00:00:08 [warn] WiFi module is idle
power up + 00:00:12 [warn] WiFi module is connected to access point RV32-IOT2G, IP address 192.168.10.xx
power up + 00:00:15 [warn] HTTP client 192.168.10.x login succeeded (session key 3688356955)
2023-10-29 10:07:24 [warn] Date and time set at power up + 00:00:15
2023-10-29 10:07:24 [warn] HTTP client 192.168.10.x login succeeded (session key 3699114206)
2023-10-29 10:26:40 [warn] HTTP client 192.168.10.x login succeeded (session key 0)
2023-10-29 10:26:41 [warn] HTTP client 192.168.10.x disconnected
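For anyone wanting to put numbers on the connect/disconnect churn rather than eyeballing it, here is a minimal Python sketch that counts login and disconnect lines per minute. It assumes the log has one entry per line in the same wording as the excerpt above; the function name `events_per_minute` and the short `sample` list are made up for illustration, and lines starting with "power up +" are simply skipped.

```python
import re
from collections import Counter

# Matches only the timestamped HTTP client lines; "power up + ..." lines are skipped.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2} \[warn\] HTTP client \S+ "
    r"(login succeeded|disconnected)"
)


def events_per_minute(lines):
    """Return a Counter keyed by (minute, event) for login/disconnect lines."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


# A few lines copied from the log above, plus one boot line that should be ignored.
sample = [
    "2023-10-29 10:52:24 [warn] HTTP client 192.168.10.x disconnected",
    "2023-10-29 10:52:24 [warn] HTTP client 192.168.10.x login succeeded (session key 1733578164)",
    "2023-10-29 10:52:27 [warn] HTTP client 192.168.10.x login succeeded (session key 4202995206)",
    "2023-10-29 10:52:43 [warn] HTTP client 192.168.10.x disconnected",
    "power up + 00:00:03 [info] Event logging started at level debug",
]

counts = events_per_minute(sample)
for (minute, event), n in sorted(counts.items()):
    print(minute, event, n)
```

Feeding it the whole log file (for example `open(path).read().splitlines()`) would give a per-minute picture that can be compared between a quiet period and the minutes before a crash.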
M122:
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.5.0-rc.1 (2023-08-31 16:16:56) running on Duet 3 Mini5plus WiFi (standalone mode)
Board ID: XNHXF-HR6KL-K65J0-409N2-K9W1Z-RV2MZ
Used output buffers: 9 of 40 (40 max)
=== RTOS ===
Static ram: 102836
Dynamic ram: 124036 of which 0 recycled
Never used RAM 11688, free system stack 186 words
Tasks: NETWORK(1,ready,17.7%,197) HEAT(3,nWait,0.0%,352) Move(4,nWait,0.0%,358) CanReceiv(6,nWait,0.0%,773) CanSender(5,nWait,0.0%,337) CanClock(7,delaying,0.0%,351) TMC(4,nWait,0.7%,108) MAIN(1,running,80.4%,704) IDLE(0,ready,0.3%,29) AIN(4,delaying,0.8%,264), total 100.0%
Owned mutexes:
=== Platform ===
Last reset 00:23:43 ago, cause: software
Last software reset at 2023-10-29 11:07, reason: HardFault invState, Gcodes spinning, available RAM 4516, slot 2
Software reset code 0x4063 HFSR 0x40000000 CFSR 0x00020000 ICSR 0x00487803 BFAR 0xe000ed38 SP 0x20011fa8 Task NETW Freestk 482 ok
Stack: 000001ae 20032c0a 0000000a 00000000 20032c0a 0009dff9 00000000 600f0000 00000000 00000000 00000000 00000000 20031a2c 00000800 20034d50 2002bf30 20018668 2002bd9d 20018668 2001e880 0002fedf 00000000 00000000 00000000 20012058 00000014 b5dd8a35
Error status: 0x04
Aux0 errors 0,0,0
MCU revision 3, ADC conversions started 1423060, completed 1423060, timed out 0, errs 0
MCU temperature: min 34.0, current 34.7, max 37.3
Supply voltage: min 22.9, current 24.1, max 26.1, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/33, heap memory allocated/used/recyclable 2048/792/356, gc cycles 66
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 0, read errors 0, write errors 1, ifcnt 191, reads 9336, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 1: standstill, SG min 0, read errors 0, write errors 1, ifcnt 188, reads 9336, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 2: standstill, SG min 0, read errors 0, write errors 1, ifcnt 89, reads 9335, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 3: standstill, SG min 0, read errors 0, write errors 1, ifcnt 90, reads 9335, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 4: standstill, SG min 0, read errors 0, write errors 1, ifcnt 90, reads 9336, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 5: not present
Driver 6: not present
Date/time: 2023-10-29 10:30:51
Cache data hit count 2342023185
Slowest loop: 216.46ms; fastest: 0.11ms
=== Storage ===
Free file entries: 16
SD card 0 detected, interface speed: 22.5MBytes/sec
SD card longest read time 6.8ms, write time 4.5ms, max retries 0
=== Move ===
DMs created 83, segments created 0, maxWait 0ms, bed compensation in use: none, height map offset 0.000, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 0 is on, I-accum = 0.3
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Movement locks held by null, null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is doing "M190 S110" in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is doing "G4 S1" in state(s) 0 0, running macro
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is doing "M190 S110" in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x0000803
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== Filament sensors ===
Extruder 0 sensor: no filament
=== CAN ===
Messages queued 12818, received 29210, lost 0, boc 0
Longest wait 2ms for reply type 6031, peak Tx sync delay 248, free buffers 26 (min 25), ts 7116/7115/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 18.98ms; fastest: 0.00ms
Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
HTTP sessions: 2 of 8
=== WiFi ===
Interface state: active
Module is connected to access point
Failed messages: pending 0, notrdy 0, noresp 0
Firmware version 2.1beta4
MAC address c4:5b:be:ce:91:93
Module reset reason: Power up, Vcc 3.38, flash size 2097152, free heap 43132
WiFi IP address 192.168.10.x
Signal strength -50dBm, channel 6, mode 802.11n, reconnections 0
Clock register 00002001
Socket states: 0 0 0 0 0 0 0 0
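As a reading aid for the "Software reset code" line in the report above, the HFSR and CFSR values can be decoded into their named fault flags. The sketch below uses the bit positions from the ARMv7-M architecture (the Mini 5+ MCU is a Cortex-M4 part); the `decode` helper is a made-up name, and this only names the flags, it says nothing about which code raised them.

```python
# Bit names per the ARMv7-M architecture reference (System Control Block).
HFSR_BITS = {1: "VECTTBL", 30: "FORCED", 31: "DEBUGEVT"}

# CFSR packs three status fields: MemManage (bits 0-7),
# BusFault (bits 8-15) and UsageFault (bits 16-31).
CFSR_BITS = {
    0: "IACCVIOL", 1: "DACCVIOL", 3: "MUNSTKERR", 4: "MSTKERR",
    5: "MLSPERR", 7: "MMARVALID",
    8: "IBUSERR", 9: "PRECISERR", 10: "IMPRECISERR", 11: "UNSTKERR",
    12: "STKERR", 13: "LSPERR", 15: "BFARVALID",
    16: "UNDEFINSTR", 17: "INVSTATE", 18: "INVPC", 19: "NOCP",
    24: "UNALIGNED", 25: "DIVBYZERO",
}


def decode(value, bit_names):
    """Return the names of all flag bits that are set in value."""
    return [name for bit, name in sorted(bit_names.items()) if value & (1 << bit)]


# Values copied from the M122 report above.
hfsr_flags = decode(0x40000000, HFSR_BITS)
cfsr_flags = decode(0x00020000, CFSR_BITS)
print("HFSR:", hfsr_flags)
print("CFSR:", cfsr_flags)
```

The decoded flags line up with the "HardFault invState" reason text in the report: a UsageFault with INVSTATE set, escalated (FORCED) to a HardFault. Since BFARVALID is not set, the BFAR value in that line should not be read as a fault address.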
When the crash happened I had one Chrome tab open with DWC loaded, in addition to the instance in Orca Slicer.
-
-
-
@Exerqtor Are you able to recreate this crash by having DWC running in Chrome and Orca slicer running at the same time (Ideally not doing anything with the slicer)?
-
@gloomyandy
Well, since it's still intermittent, I can't really do much other than leave both open for the evening with the printer idle, check back every now and then to see how it's playing out, and report back next time something happens.
Btw, aren't you a moderator Andy? In that case could you please lock the old thread so we don't have any more posts popping up in that one? 😅
-
-
@Exerqtor said in Reboots/crashes - RRF 3.5.0-rc1:
Supply voltage: min 22.9, current 24.1, max 26.1, under voltage events: 0, over voltage events: 0, power good: yes
Clutching at straws I know but if the machine was just sitting idle, why does the supply voltage vary so much? I don't see anything like that on my machine, even after multi-hour prints.
Might this indicate that the PSU is none too stable? Might it be that the cause of the issue is noisy supply voltage or some other issue related to the power supply which started coincidentally with a firmware change? Just a guess but maybe worth looking into?
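To compare the Vin figures between reports without reading them off by hand, a small sketch like the following can pull the min/current/max values out of a pasted M122 and compute the spread. It assumes the "Supply voltage:" line is worded as in the reports in this thread; the function name `vin_spread` is made up for illustration.

```python
import re

# Wording taken from the "Supply voltage:" line of the M122 reports in this thread.
VIN_RE = re.compile(r"Supply voltage: min ([\d.]+), current ([\d.]+), max ([\d.]+)")


def vin_spread(m122_text):
    """Return min/current/max and the max-min spread in volts, or None if not found."""
    m = VIN_RE.search(m122_text)
    if m is None:
        return None
    vmin, current, vmax = (float(v) for v in m.groups())
    return {
        "min": vmin,
        "current": current,
        "max": vmax,
        "spread": round(vmax - vmin, 2),
    }


report = (
    "Supply voltage: min 22.9, current 24.1, max 26.1, "
    "under voltage events: 0, over voltage events: 0, power good: yes"
)
result = vin_spread(report)
print(result)
```

This only reports what the board's own monitor recorded, so very short transients may not show up in these numbers at all.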
-
@deckingman
I have no idea to be honest, but at least it's a genuine Meanwell UHP-200-24 for the 24 V rail, the bed is mains powered through an SSR, and the Duet 5 V rail is running off a genuine Meanwell RS-25-5.
The mains voltage is on the higher side here, BUT it's stable at 245 V, and both PSUs are rated to handle everything between 88 and 264 V.
So there SHOULD be enough power on hand unless the 24 V is wonky.
-
@Exerqtor Probably worth keeping an eye on though. I'd suggest running M122 shortly after powering the board on as that would clear any odd values that might occur as the psu turns on. If you then get strange min/max values after that I'd suggest you might want to take a closer look at the psu (even quality brands sometimes have problems).
Oh and sorry no I don't have any admin powers here!
-
@Exerqtor locked it for you
-
@gloomyandy
Yeah, that M122 was run a couple of minutes after the printer crashed, so it's "fresh" in that regard 😅
Of course, everything made will have faults 🤣
I've been watching the voltage for a while now and it's been stable at 24 V all the time, so it looks like it's only fluctuating like that at boot-up.
Okok, I must have mixed you up with one of the other regulars 🤣
Could you maybe lock the old thread @Phaedrux?
Jay fixed it before I managed to post the reply! 😇
-
@Exerqtor Are you checking the min/max values when observing it? I think you can see them by hovering over the voltage in DWC. They should capture any relatively short fluctuations.
-
@gloomyandy
Yeah, that's the one I'm looking at. Currently it's 24.1 V with min/max = 23.5/24.5 V, and this is while standing idle.
-
@Exerqtor Noticed the multiple disconnects/reconnects. I was running into those this summer, to the point that I was not able to do anything on the PanelDues, as I was clicking the acknowledgment trying to get past them. At times they were rapidly filling the screen. DWC was also knocked out by them.
Mine seemed to be caused by having a fancy mesh network for WiFi and a wireless Vtech landline phone sitting on the same workbench. Traded phones with the one in the back bedroom, which was a different brand. That helped some. Then removed the mesh node in the workshop and installed a Ubiquiti Nanostation. The constant disconnects/reconnects stopped after that. The workshop is about 80 feet behind the room the router is in, so I had mesh nodes in the storage building and the workshop to supply WiFi.
What was weird was that it did not happen constantly but seemed to be worse in the afternoon. I thought heat in the workshop was the problem, since it became worse in late June, and started running the air conditioning when printing (which caused another whole set of issues), but that did not help.
Now things seem stable out there. 2 are on 3.4.6 and one is on 3.5.0-rc.1. All were affected by the WiFi problem. When it was dropping and reconnecting, I never checked if it was also resetting or rebooting.
Meanwells can fail. I had a SE-600-24 in an Ender 5 Plus that would cause reboots when the heated bed was first started in cool weather. Removed it when I went to a line-powered heated bed and installed a smaller (and quieter) RSP-150-24 Meanwell. That power supply was larger than the original 500 watt one that had failed, so it should have been fine.
-
@Exerqtor said in Reboots/crashes - RRF 3.5.0-rc1:
@deckingman
I have no idea to be honest, but at least it's a genuine Meanwell UHP-200-24 for the 24 V rail, the bed is mains powered through an SSR, and the Duet 5 V rail is running off a genuine Meanwell RS-25-5.
The mains voltage is on the higher side here, BUT it's stable at 245 V, and both PSUs are rated to handle everything between 88 and 264 V.
So there SHOULD be enough power on hand unless the 24 V is wonky.
Yes I wasn't thinking about the capability of the PSU, just the fact that the M122 shows Vin varying from 22.9 to 26.1 when the machine is idle. Maybe there are some fast transient excursions that ought not to be there. Given everything else that you've tried, it might be worth looking into further.....
-
@KenW
I hear you, but if it turns out that the disconnects/connects are causing the crashing, this is an issue that needs solving on the Duet side, NOT by changing out any of the other WiFi equipment (Ubiquiti UniFi network, with a dedicated 2.4 GHz SSID).
The Duet is leaving the house before any of the APs, if we put it that way 😂😂😂😂
Regarding the voltage issues, I haven't got the feeling that's the cause, since the voltages are still WAY within the Duet's operating specs after all.
@deckingman
Yeah that's true, I just have a sneaking feeling this whole mess has its root in something related to the WiFi module at this point.
Other than that, it had another crash a couple of hours ago that I just saw.
I've been connecting to the printer with both my phone and laptop intermittently to check if anything's happening (hence the two other IPs).
debug log:
2023-10-29 12:40:42 [warn] HTTP client 192.168.10.x login succeeded (session key 451652038)
2023-10-29 12:59:12 [warn] HTTP client 192.168.10.2xx login succeeded (session key 947195611)
2023-10-29 13:47:02 [warn] HTTP client 192.168.10.xxx login succeeded (session key 834710639)
2023-10-29 13:51:46 [warn] HTTP client 192.168.10.xxx disconnected
2023-10-29 14:00:05 [warn] HTTP client 192.168.10.xxx login succeeded (session key 2059405738)
2023-10-29 14:00:55 [warn] HTTP client 192.168.10.xxx disconnected
2023-10-29 16:39:10 [warn] HTTP client 192.168.10.xxx login succeeded (session key 3826474446)
2023-10-29 17:28:47 [warn] HTTP client 192.168.10.xxx login succeeded (session key 2170746625)
2023-10-29 18:20:29 [warn] HTTP client 192.168.10.xxx login succeeded (session key 3718996851)
power up + 00:00:03 [info] Event logging started at level debug
power up + 00:00:03 [info] Running: Duet 3 Mini5plus WiFi: 3.5.0-rc.1 (2023-08-31 16:16:56)
power up + 00:00:03 [info] Event logging stopped
power up + 00:00:03 [info] Event logging started at level debug
power up + 00:00:03 [info] Running: Duet 3 Mini5plus WiFi: 3.5.0-rc.1 (2023-08-31 16:16:56)
power up + 00:00:03 [debug] Done!
power up + 00:00:03 [debug] RepRapFirmware for Duet 3 Mini 5+ is up and running.
power up + 00:00:04 [warn] WiFi module started
power up + 00:00:08 [warn] WiFi module is connected to access point RV32-IOT2G, IP address 192.168.10.xxxx
power up + 00:00:08 [warn] HTTP client 192.168.10.xxx login succeeded (session key 3944043149)
2023-10-29 18:28:38 [warn] Date and time set at power up + 00:00:08
x = desktop
xx = phone
xxx = laptop
xxxx = printers IP.
M122:
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.5.0-rc.1 (2023-08-31 16:16:56) running on Duet 3 Mini5plus WiFi (standalone mode)
Board ID: XNHXF-HR6KL-K65J0-409N2-K9W1Z-RV2MZ
Used output buffers: 1 of 40 (29 max)
=== RTOS ===
Static ram: 102836
Dynamic ram: 124036 of which 0 recycled
Never used RAM 11688, free system stack 182 words
Tasks: NETWORK(1,nWait,16.3%,211) HEAT(3,nWait,0.0%,352) Move(4,nWait,0.0%,358) CanReceiv(6,nWait,0.0%,773) CanSender(5,nWait,0.0%,337) CanClock(7,delaying,0.0%,351) TMC(4,nWait,0.7%,108) MAIN(1,running,81.9%,704) IDLE(0,ready,0.3%,29) AIN(4,delaying,0.8%,264), total 100.0%
Owned mutexes: WiFi(NETWORK)
=== Platform ===
Last reset 01:24:28 ago, cause: software
Last software reset at 2023-10-29 18:28, reason: HardFault invState, Gcodes spinning, available RAM 5200, slot 0
Software reset code 0x4063 HFSR 0x40000000 CFSR 0x00020000 ICSR 0x00000803 BFAR 0xe000ed38 SP 0x20011fa8 Task NETW Freestk 482 ok
Stack: 000001b0 00000002 200014ec 00000000 ffffffff 0009df2d 00000000 600f0000 00000000 00000000 00000000 00000000 200301d4 00000800 20035710 2002bf00 20018668 2002bd9d 20018668 2001e880 0002fedf 00000000 00000000 00000000 20012058 00000014 00000000
Error status: 0x00
Aux0 errors 0,0,0
MCU revision 3, ADC conversions started 5068453, completed 5068451, timed out 0, errs 0
MCU temperature: min 34.4, current 35.1, max 37.8
Supply voltage: min 23.1, current 24.1, max 26.0, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/33, heap memory allocated/used/recyclable 2048/564/128, gc cycles 236
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 68, read errors 0, write errors 1, ifcnt 235, reads 4587, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 1: standstill, SG min 68, read errors 0, write errors 1, ifcnt 231, reads 4587, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 2: standstill, SG min 0, read errors 0, write errors 1, ifcnt 115, reads 4586, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 3: standstill, SG min 0, read errors 0, write errors 1, ifcnt 116, reads 4586, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 4: standstill, SG min 0, read errors 0, write errors 1, ifcnt 117, reads 4587, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 5: not present
Driver 6: not present
Date/time: 2023-10-29 19:52:58
Cache data hit count 4294967295
Slowest loop: 12.68ms; fastest: 0.13ms
=== Storage ===
Free file entries: 18
SD card 0 detected, interface speed: 22.5MBytes/sec
SD card longest read time 6.4ms, write time 7.1ms, max retries 0
=== Move ===
DMs created 83, segments created 0, maxWait 0ms, bed compensation in use: none, height map offset 0.000, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Movement locks held by null, null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is doing "G4 S1" in state(s) 0 0, running macro
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x0000803
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== Filament sensors ===
Extruder 0 sensor: no filament
=== CAN ===
Messages queued 45625, received 103943, lost 0, boc 0
Longest wait 2ms for reply type 6031, peak Tx sync delay 336, free buffers 26 (min 25), ts 25342/25341/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 24.73ms; fastest: 0.00ms
Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
HTTP sessions: 1 of 8
=== WiFi ===
Interface state: active
Module is connected to access point
Failed messages: pending 0, notrdy 0, noresp 0
Firmware version 2.1beta4
MAC address c4:5b:be:ce:91:93
Module reset reason: Power up, Vcc 3.38, flash size 2097152, free heap 42972
WiFi IP address 192.168.10.xxxx
Signal strength -49dBm, channel 6, mode 802.11n, reconnections 0
Clock register 00002001
Socket states: 0 0 0 0 0 0 0 0
-
@Exerqtor If the problem is some sort of short transient from the power supply that could easily not be captured by the voltage monitor.
It might be better to test using the same setup you had when things crashed, so Orca Slicer plus Chrome? Especially if Orca slicer is doing something that is different to the normal connection pattern generated by DWC.
-
Dumb question but where is this debug.log that you are referring to?
-
@DaveA said in Reboots/crashes - RRF 3.5.0-rc1:
Dumb question but where is this debug.log that you are referring to?
It has to be configured through M929.
-
@Exerqtor thanks for creating this new thread, it makes this issue easier to track.
Please continue to post M122 reports whenever this happens, in case I can establish some common data in the stack trace.
Please install the firmware at https://www.dropbox.com/scl/fo/tjznycpk7bv7sj71p0ssl/h?rlkey=096p4nvgmigyrb20jj8olg3wu&dl=0. The only issue it fixes that might cause this type of crash is related to filament monitors and (I think) tool changes, so I doubt that it will fix the issue you are experiencing.
Re DWC disconnects/reconnects, do you have a PanelDue connected? See https://forum.duet3d.com/topic/33538/duet-2-ethernet-disconnect-when-printing-and-paneldue-connected.
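On the point about finding common data in the stack traces: as a rough aid, the "Stack:" lines from the two M122 reports of 2023-10-29 can be compared position by position. The sketch below is only a textual diff of the posted words; the helper name `common_stack_words` is made up, and turning any address into a function name would still need the map file for that exact firmware build.

```python
def common_stack_words(stack_a, stack_b):
    """Return (index, word) pairs where both dumps hold the same non-zero word."""
    words_a = stack_a.split()
    words_b = stack_b.split()
    return [
        (i, a)
        for i, (a, b) in enumerate(zip(words_a, words_b))
        if a == b and int(a, 16) != 0
    ]


# "Stack:" lines copied from the two M122 reports earlier in this thread.
crash_1107 = (
    "000001ae 20032c0a 0000000a 00000000 20032c0a 0009dff9 00000000 600f0000 "
    "00000000 00000000 00000000 00000000 20031a2c 00000800 20034d50 2002bf30 "
    "20018668 2002bd9d 20018668 2001e880 0002fedf 00000000 00000000 00000000 "
    "20012058 00000014 b5dd8a35"
)
crash_1828 = (
    "000001b0 00000002 200014ec 00000000 ffffffff 0009df2d 00000000 600f0000 "
    "00000000 00000000 00000000 00000000 200301d4 00000800 20035710 2002bf00 "
    "20018668 2002bd9d 20018668 2001e880 0002fedf 00000000 00000000 00000000 "
    "20012058 00000014 00000000"
)

matches = common_stack_words(crash_1107, crash_1828)
for index, word in matches:
    print(index, word)
```

Zero words are ignored on purpose, since matching zeros say little. The same comparison can be repeated against each new report as it gets posted.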
-
@Exerqtor out of curiosity since you did not mention it (I have to admit I did not read through that huge other thread): The fact that - if I understand you correctly - you say the Mini worked first but started to show issues after a while makes me wonder a bit.
Therefore, just to have asked and based on my rusty electronics knowledge - did you check the following two things:
- a faulty power supply might produce spikes that will probably never pop up on the voltage display of the Duet but might cause issues or even slowly kill your controller (evil to find out and as far as I can tell the Duet is well hardened against such things, but there is always a limit for that). I would not know how to find that except with an oscilloscope however, or by blindly replacing the power supply with another one that is known to work.
- obscure emf / crosstalk issues (which I would not rule out on a new printer) might be found or at least narrowed down if you run the Duet with literally everything unplugged. Do these reboots happen even then? If yes, replace the power supply next. If not, start to plug things back in one item at a time, wait for the issue to (not) happen, turn off the Duet, plug the next in (or unplug and note the possible culprit if it happens - but do not stop there), boot the Duet etc. If you have located all connectors that seem to cause the issue, remove all the others, plug the first of those in and repeat the process. If you are lucky, the issue happens at once, if not, you need to go on until you have found its counterpart(s)... and so on.
It is an unnervingly lengthy process since you will need to wait for an extended period of time between each attempt, but since you had to fight with that issue for so long now... maybe that process helps. Well, unless you already tried that...
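The first pass of that procedure (everything unplugged, then one connector at a time) can be written down as a short sketch, mostly to make the bookkeeping explicit. Everything here is hypothetical: `first_pass` and `crashes_with` are made-up names, `crashes_with` stands in for a person waiting and watching the printer, and the connector names and the toy rule in the demo are not a claim about the actual cause. The second pass (testing the suspects against each other) is left out.

```python
def first_pass(connectors, crashes_with):
    """Plug connectors in one at a time and collect those that coincide with a crash.

    crashes_with is a hypothetical callback: given the set of connectors
    currently plugged in, it returns True if a crash was observed during
    the waiting period.
    """
    if crashes_with(frozenset()):
        # Crashes with everything unplugged: look at the PSU (or board) next.
        return None
    plugged = set()
    suspects = []
    for connector in connectors:
        plugged.add(connector)
        if crashes_with(frozenset(plugged)):
            suspects.append(connector)  # note the possible culprit...
            plugged.remove(connector)   # ...unplug it, and do not stop there
    return suspects


# Toy stand-in for real observations: pretend only one cable matters.
demo = first_pass(
    ["X motor", "hotend fan", "PanelDue", "bed thermistor"],
    lambda plugged: "PanelDue" in plugged,
)
print(demo)
```

With an intermittent crash, each call to the callback really means "wait long enough to be reasonably sure", which is where the length of the process comes from.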
-
@dc42 said in Reboots/crashes - RRF 3.5.0-rc1:
@Exerqtor thanks for creating this new thread, it makes this issue easier to track.
Please continue to post M122 reports whenever this happens, in case I can establish some common data in the stack trace.
Will do. There hasn't been another reboot since the last one I posted, but the desktop has been in hibernation without any connections to DWC. I will continue posting reports until a solution is found 🤣
Please install the firmware at https://www.dropbox.com/scl/fo/tjznycpk7bv7sj71p0ssl/h?rlkey=096p4nvgmigyrb20jj8olg3wu&dl=0. The only issue it fixes that might cause this type of crash is related to filament monitors and (I think) tool changes, so I doubt that it will fix the issue you are experiencing.
Installing that now, thanks!
Re DWC disconnects/reconnects, do you have a PanelDue connected? See https://forum.duet3d.com/topic/33538/duet-2-ethernet-disconnect-when-printing-and-paneldue-connected.
Yup, I have a PanelDue connected as well. I'll update the system description to include that, as I completely forgot to mention it.
@NeoDue said in Reboots/crashes - RRF 3.5.0-rc1:
@Exerqtor out of curiosity since you did not mention it (I have to admit I did not read through that huge other thread): The fact that - if I understand you correctly - you say the Mini worked first but started to show issues after a while makes me wonder a bit.
Therefore, just to have asked and based on my rusty electronics knowledge - did you check the following two things:
- a faulty power supply might produce spikes that will probably never pop up on the voltage display of the Duet but might cause issues or even slowly kill your controller (evil to find out and as far as I can tell the Duet is well hardened against such things, but there is always a limit for that). I would not know how to find that except with an oscilloscope however, or by blindly replacing the power supply with another one that is known to work.
To elaborate a little on that statement I made about the Mini seeming to be working:
When I hooked it up and did the initial tests I didn't have the desktop computer or any other clients connected through DWC either. I just did the PID tuning etc. from the PanelDue and ran a quick test cube to see that everything actually worked. Then the printer stood idle without any use until I had time to work on it in early October, when I quickly saw that the issues weren't solved at all.
- obscure emf / crosstalk issues (which I would not rule out on a new printer) might be found or at least narrowed down if you run the Duet with literally everything unplugged. Do these reboots happen even then? If yes, replace the power supply next. If not, start to plug things back in one item at a time, wait for the issue to (not) happen, turn off the Duet, plug the next in (or unplug and note the possible culprit if it happens - but do not stop there), boot the Duet etc. If you have located all connectors that seem to cause the issue, remove all the others, plug the first of those in and repeat the process. If you are lucky, the issue happens at once, if not, you need to go on until you have found its counterpart(s)... and so on.
It is an unnervingly lengthy process since you will need to wait for an extended period of time between each attempt, but since you had to fight with that issue for so long now... maybe that process helps. Well, unless you already tried that...
I'll try to see how it behaves over the weekend, and if it's still acting up (which i assume it will) I'll try running it "barebones" for a day or two and see if anything happens.
Not saying a faulty PSU or EMF ain't the issue, it's just not what my gut is telling me at this point.
-
And a new crash, also while the printer was idle, but while I was active on the desktop with an instance of DWC open through OrcaSlicer.
This is the first crash in 4 days, and the only difference (other than FW updates that shouldn't affect this anyway) is that I've been active on the desktop with a DWC instance connected.
Don't know if this is just me seeing patterns in the clouds lol, but I'm sure starting to see a clear pattern in when it crashes and when it doesn't.
This is with RRF 3.5.0-rc.1+ on the Mini / 1LC, and 3.5.0-rc7 on the PanelDue.
Debug log:
power up + 00:00:03 [info] Event logging started at level debug
power up + 00:00:03 [info] Running: Duet 3 Mini5plus WiFi: 3.5.0-rc.1+ (2023-11-01 10:29:03)
power up + 00:00:03 [info] Event logging stopped
power up + 00:00:03 [info] Event logging started at level debug
power up + 00:00:03 [info] Running: Duet 3 Mini5plus WiFi: 3.5.0-rc.1+ (2023-11-01 10:29:03)
power up + 00:00:03 [debug] Done!
power up + 00:00:03 [debug] RepRapFirmware for Duet 3 Mini 5+ is up and running.
power up + 00:00:04 [warn] Error: Heater 1 fault: failed to read sensor: notReady
power up + 00:00:04 [info] M291: - Event notification - Heater 1 fault: failed to read sensor: notReady
power up + 00:00:04 [debug] - Event notification -
power up + 00:00:04 [debug] Heater 1 fault: failed to read sensor: notReady
power up + 00:00:08 [warn] WiFi module is connected to access point RV32-IOT2G, IP address 192.168.10.x
power up + 00:00:08 [warn] HTTP client 192.168.10.xx login succeeded (session key 1133366275)
2023-11-03 23:29:57 [warn] Date and time set at power up + 00:00:08
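For anyone who wants to sift through event logs like this one, here is a rough Python sketch that splits a pasted log into separate entries and picks out the warnings. It is only an illustration: the function name `split_event_log` is made up for this post, and it assumes every entry starts with either `power up + HH:MM:SS` or a `YYYY-MM-DD HH:MM:SS` timestamp followed by one of the levels seen above (`info`, `warn`, `debug`).

```python
import re

# Assumption: an entry begins with one of these two timestamp styles,
# immediately followed by a level in square brackets.
STAMP = r"(?:power up \+ \d{2}:\d{2}:\d{2}|\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
ENTRY_START = re.compile(r"(?=" + STAMP + r" \[(?:info|warn|debug)\])")
ENTRY = re.compile(
    r"^(?P<stamp>" + STAMP + r") \[(?P<level>info|warn|debug)\] (?P<message>.*)$"
)


def split_event_log(text):
    """Split run-together log text into dicts with stamp, level and message."""
    entries = []
    # Splitting on a zero-width lookahead needs Python 3.7 or newer.
    for chunk in ENTRY_START.split(text):
        chunk = chunk.strip()
        if not chunk:
            continue
        match = ENTRY.match(chunk)
        if match:
            entries.append(match.groupdict())
    return entries


if __name__ == "__main__":
    pasted = (
        "power up + 00:00:03 [info] Event logging started at level debug "
        "power up + 00:00:04 [warn] Error: Heater 1 fault: failed to read sensor: notReady "
        "2023-11-03 23:29:57 [warn] Date and time set at power up + 00:00:08"
    )
    for entry in split_event_log(pasted):
        if entry["level"] == "warn":
            print(entry["stamp"], "|", entry["message"])
```

Note that a `power up + ...` time inside a message (as in the last entry) is not treated as a new entry, because it is not followed by a bracketed level.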
M122:
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.5.0-rc.1+ (2023-11-01 10:29:03) running on Duet 3 Mini5plus WiFi (standalone mode)
Board ID: XNHXF-HR6KL-K65J0-409N2-K9W1Z-RV2MZ
Used output buffers: 9 of 40 (40 max)
=== RTOS ===
Static ram: 102812
Dynamic ram: 123924 of which 0 recycled
Never used RAM 11824, free system stack 186 words
Tasks: NETWORK(2,nWait,18.3%,209) HEAT(3,nWait,0.0%,354) Move(4,nWait,0.0%,358) CanReceiv(6,nWait,0.0%,797) CanSender(5,nWait,0.0%,336) CanClock(7,delaying,0.0%,350) TMC(4,nWait,0.7%,108) MAIN(1,running,79.8%,670) IDLE(0,ready,0.3%,29) AIN(4,delaying,0.8%,264), total 100.0%
Owned mutexes:
=== Platform ===
Last reset 00:13:05 ago, cause: software
Last software reset at 2023-11-03 23:29, reason: HardFault invState, Gcodes spinning, available RAM 10940, slot 2
Software reset code 0x4063 HFSR 0x40000000 CFSR 0x00020000 ICSR 0x00000803 BFAR 0xe000ed38 SP 0x20011f88 Task NETW Freestk 482 ok
Stack: 000001b0 00000002 200014e8 00000000 ffffffff 0009e9cd 00000000 600f0000 00000000 00000000 00000000 00000000 20031c4c 00000800 2002c0e0 2002c0e0 00000001 2002bf7d 20018658 2001e868 0002ff97 00000000 00000000 00000000 20012038 00000014 b5dd8a35
Error status: 0x04
Aux0 errors 0,0,0
MCU revision 3, ADC conversions started 785822, completed 785822, timed out 0, errs 0
MCU temperature: min 37.0, current 37.4, max 40.1
Supply voltage: min 22.5, current 24.1, max 26.5, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/33, heap memory allocated/used/recyclable 2048/1856/1420, gc cycles 36
Events: 1 queued, 1 completed
Driver 0: standstill, SG min 16, read errors 0, write errors 1, ifcnt 31, reads 41299, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 1: standstill, SG min 246, read errors 0, write errors 1, ifcnt 27, reads 41299, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 2: standstill, SG min 2, read errors 0, write errors 1, ifcnt 161, reads 41299, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 3: standstill, SG min 2, read errors 0, write errors 1, ifcnt 162, reads 41298, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 4: standstill, SG min 2, read errors 0, write errors 1, ifcnt 164, reads 41299, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 5: not present
Driver 6: not present
Date/time: 2023-11-03 23:42:53
Cache data hit count 1378843841
Slowest loop: 13.16ms; fastest: 0.13ms
=== Storage ===
Free file entries: 18
SD card 0 detected, interface speed: 22.5MBytes/sec
SD card longest read time 6.9ms, write time 4.6ms, max retries 0
=== Move ===
DMs created 83, segments created 0, maxWait 0ms, bed compensation in use: none, height map offset 0.000, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
=== GCodes ===
Movement locks held by null, null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is doing "G4 S1" in state(s) 0 0, running macro
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x0000803
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== Filament sensors ===
check 0 clear 0
Extruder 0 sensor: no filament
=== CAN ===
Messages queued 7076, received 16136, lost 0, errs 1, boc 0
Longest wait 2ms for reply type 6031, peak Tx sync delay 231, free buffers 26 (min 25), ts 3926/3925/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 8.42ms; fastest: 0.00ms
Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
HTTP sessions: 3 of 8
=== WiFi ===
Interface state: active
Module is connected to access point
Failed messages: pending 0, notrdy 0, noresp 0
Firmware version 2.1beta4
MAC address c4:5b:be:ce:91:93
Module reset reason: Power up, Vcc 3.38, flash size 2097152, free heap 39880
WiFi IP address 192.168.10.x
Signal strength -50dBm, channel 6, mode 802.11n, reconnections 0
Clock register 00002001
Socket states: 0 0 0 0 0 0 0 0
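As a side note for reading the "Software reset code" line in the M122 output, here is a small Python sketch that pulls out the HFSR and CFSR values and names the bits that are set. The bit names are the standard ARMv7-M (Cortex-M) fault status flags; that this layout applies to the Mini 5+ processor is my assumption, and the helper names `parse_fault_registers` and `decode_bits` are made up for this post, not part of RRF.

```python
import re

# Standard ARMv7-M HardFault Status Register flags (bit position -> name).
HFSR_BITS = {1: "VECTTBL", 30: "FORCED", 31: "DEBUGEVT"}

# Standard ARMv7-M Configurable Fault Status Register flags:
# bits 0-7 MemManage, bits 8-15 BusFault, bits 16-31 UsageFault.
CFSR_BITS = {
    0: "IACCVIOL", 1: "DACCVIOL", 3: "MUNSTKERR", 4: "MSTKERR",
    5: "MLSPERR", 7: "MMARVALID",
    8: "IBUSERR", 9: "PRECISERR", 10: "IMPRECISERR", 11: "UNSTKERR",
    12: "STKERR", 13: "LSPERR", 15: "BFARVALID",
    16: "UNDEFINSTR", 17: "INVSTATE", 18: "INVPC", 19: "NOCP",
    24: "UNALIGNED", 25: "DIVBYZERO",
}


def parse_fault_registers(m122_text):
    """Return {register name: integer value} for the registers found in the text."""
    pairs = re.findall(r"\b(HFSR|CFSR|ICSR|BFAR|SP) 0x([0-9a-fA-F]+)", m122_text)
    return {name: int(value, 16) for name, value in pairs}


def decode_bits(value, names):
    """List the names of all known flag bits that are set in value."""
    return [names[bit] for bit in sorted(names) if (value >> bit) & 1]


if __name__ == "__main__":
    line = ("Software reset code 0x4063 HFSR 0x40000000 CFSR 0x00020000 "
            "ICSR 0x00000803 BFAR 0xe000ed38 SP 0x20011f88 Task NETW Freestk 482 ok")
    regs = parse_fault_registers(line)
    print("HFSR:", decode_bits(regs["HFSR"], HFSR_BITS))
    print("CFSR:", decode_bits(regs["CFSR"], CFSR_BITS))
```

For the values in this report it gives FORCED for HFSR and INVSTATE for CFSR, which lines up with the "HardFault invState" reason the firmware prints. It doesn't say anything about why the fault happened.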
A new development this time is that the hotend heater reports a fault and reads 2000C after the crash.
After restarting it manually (with M999) it reports back as usual (active / 25C).