Reboots/crashes - RRF ≤3.5.0-rc1
-
@Exerqtor I have a Duet 3 6HC here. If my assumption from yesterday is correct, this applies to that board as well. But I guess it might depend heavily on how Chromium works on the computer in question, including anything that might affect it, such as how the computer handles standby states.
I will only be able to test my assumption in about a week since I am not at home, but I will follow this thread and then try to recreate the issue on both my Duet 3 6HC and my old Duet 2, unless you have already found a solution by then. You two are working really fast at the moment.
-
@dc42 said in Reboots/crashes - RRF 3.5.0-rc1:
@Exerqtor
When I say <CR> I mean the control character carriage return. Wireshark displays it as \r, so OrcaSlicer does indeed send that sequence at the end of the Accept-Language header. Do you get any crashes if you connect Chrome to the Duet but not OrcaSlicer?
The IP address is now removed from Orca Slicer, and I've got two DWC instances open in Chrome.
So let's play the waiting game and see what happens.
-
@dc42 said in Reboots/crashes - RRF 3.5.0-rc1:
@Exerqtor
When I say <CR> I mean the control character carriage return. Wireshark displays it as \r, so OrcaSlicer does indeed send that sequence at the end of the Accept-Language header. Do you get any crashes if you connect Chrome to the Duet but not OrcaSlicer?
It's not Orca Slicer, or at least not "just" Orca Slicer, that's causing the crashes. It happened again with it closed/not connected:
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.5.0-rc.1++wdb (2023-12-11 09:30:48) running on Duet 3 Mini5plus WiFi (standalone mode)
Board ID: XNHXF-HR6KL-K65J0-409N2-K9W1Z-RV2MZ
Used output buffers: 2 of 40 (39 max)
=== RTOS ===
Static ram: 102844
Dynamic ram: 123224 of which 208 recycled
Never used RAM 12284, free system stack 180 words
Tasks: NETWORK(2,nWait,193.2%,215) HEAT(3,nWait,0.2%,327) Move(4,nWait,0.0%,352) CanReceiv(6,nWait,0.4%,774) CanSender(5,nWait,0.0%,336) CanClock(7,delaying,0.1%,350) TMC(4,nWait,6.4%,108) MAIN(1,running,250.4%,670) IDLE(0,ready,42.9%,29) AIN(4,delaying,7.6%,264), total 501.1%
Owned mutexes: WiFi(NETWORK)
=== Platform ===
Last reset 03:34:40 ago, cause: software
Last software reset at 2023-12-13 02:36, reason: AssertionFailed, Gcodes spinning, available RAM 11124, slot 1
Software reset code 0x4123 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00000000 BFAR 0xe000ed38 SP 0x20011fbc Task NETW Freestk 495 ok
Stack: 00000919 000af3d0 0002de6f 2002bed4 2002be01 000001af 2002c100 20031458 2002c118 2001e888 0d0a0d31 a5a5a5a5 0d312e30 00000000 00000000 00000000 20031464 00000800 20035970 2002c100 20018678 2002bf9d 20018678 2001e888 0003010f 00000000 00000000
Error status: 0x00
Aux0 errors 0,0,0
MCU revision 3, ADC conversions started 12881155, completed 12881155, timed out 0, errs 0
MCU temperature: min 33.6, current 33.8, max 37.3
Supply voltage: min 21.9, current 24.0, max 27.2, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/27, heap memory allocated/used/recyclable 2048/708/356, gc cycles 26104
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 2, read errors 0, write errors 1, ifcnt 50, reads 22552, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 1: standstill, SG min 16, read errors 0, write errors 1, ifcnt 48, reads 22552, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 2: standstill, SG min 2, read errors 0, write errors 1, ifcnt 176, reads 22552, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 3: standstill, SG min 0, read errors 0, write errors 1, ifcnt 178, reads 22551, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 4: standstill, SG min 2, read errors 0, write errors 1, ifcnt 172, reads 22552, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 5: not present
Driver 6: not present
Date/time: 2023-12-13 06:11:01
Cache data hit count 4294967295
Slowest loop: 12.69ms; fastest: 0.13ms
=== Storage ===
Free file entries: 18
SD card 0 detected, interface speed: 22.5MBytes/sec
SD card longest read time 4.9ms, write time 4.5ms, max retries 0
=== Move ===
DMs created 83, segments created 0, maxWait 0ms, bed compensation in use: none, height map offset 0.000, max steps late 0, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Movement locks held by null, null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is doing "G4 P10" in state(s) 0 0, running macro
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x0000803
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== Filament sensors ===
in 0 notIn 0
Extruder 0 sensor: no filament
=== CAN ===
Messages queued 115938, received 264103, lost 0, errs 1, boc 0
Longest wait 2ms for reply type 6053, peak Tx sync delay 276, free buffers 26 (min 25), ts 64405/64404/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 6.29ms; fastest: 0.00ms
Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
HTTP sessions: 3 of 8
=== WiFi ===
Interface state: active
Module is connected to access point
Failed messages: pending 0, notrdy 0, noresp 0
Firmware version 2.1beta4
MAC address c4:5b:be:ce:91:93
Module reset reason: Power up, Vcc 3.38, flash size 2097152, free heap 39692
WiFi IP address 192.168.10.x
Signal strength -47dBm, channel 6, mode 802.11n, reconnections 0
Clock register 00002001
Socket states: 0 0 0 0 0 0 0 0
Doing the same test with Orca Slicer now: no Chrome connections and two Orca Slicer instances.
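As a reading aid for the Stack: line in the report, here is a minimal C++ sketch that renders stack words as characters. It assumes the words are dumped as little-endian 32-bit values, as is usual on Cortex-M parts; `decodeStackWord` and `decodeStackLine` are made-up names and not part of RepRapFirmware. The word a5a5a5a5 is the byte pattern FreeRTOS uses to fill unused task stack, so it simply shows up as non-printable bytes here.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper, not part of RepRapFirmware.
// Renders one 32-bit stack word as its four bytes in little-endian order:
// printable ASCII is kept, CR and LF become \r and \n, anything else becomes '.'.
std::string decodeStackWord(std::uint32_t word)
{
    std::string out;
    for (int i = 0; i < 4; ++i)
    {
        const unsigned int b = (word >> (8 * i)) & 0xFFu;
        if (b == 0x0Du)                   { out += "\\r"; }
        else if (b == 0x0Au)              { out += "\\n"; }
        else if (b >= 0x20u && b < 0x7Fu) { out += static_cast<char>(b); }
        else                              { out += '.'; }
    }
    return out;
}

// Decodes the body of a "Stack:" line (hex words separated by spaces)
// and returns the decoded words separated by '|'.
std::string decodeStackLine(const std::string& hexWords)
{
    std::istringstream in(hexWords);
    std::string token;
    std::string out;
    while (in >> token)
    {
        assert(token.size() <= 8);   // each word is at most 8 hex digits
        const auto word = static_cast<std::uint32_t>(std::stoul(token, nullptr, 16));
        if (!out.empty()) { out += '|'; }
        out += decodeStackWord(word);
    }
    return out;
}
```

With this, the word 0d0a0d31 from the report decodes to `1\r\n\r` and 0d312e30 to `0.1\r`. That looks like leftover text ending in CR/LF, but the sketch only makes the bytes readable; it does not say where they came from.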
-
@Exerqtor may I ask about the computer you ran Chrome on in that test? Which OS? Notebook or Desktop? Screensaver, sleep or hibernation / energy saving mode enabled or disabled? Unless I overlook something, one of these things probably made the difference here.
-
@dc42
Hi David, I'm reading this thread as a bystander. I asked myself whether the error could occur when the connection is interrupted but the socket is not closed, and RRF then runs into a timeout. Maybe I'm totally off, but as far as I know RRF opens a DMA channel to the WiFi/network stack and the socket handling is done over there.
What will happen if a client sends a partially delayed HTTP header like this?
GET /rr_model?flags=d99fno HTTP/1.1\r\n
Host: 192.168.10.x\r\n
Connection: keep-alive\r\n
X-Session-Key: 4102879531\r\n
User-Agent: BBL-Slicer/v01.07.07.89 (dark) Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.52\r\n
Accept: */*\r\n
Referer: http://192.168.10.x/\r\n
<VERY LONG DELAY>
Accept-Encoding: gzip, deflate\r\n
Accept-Language: nb,no;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6\r\n
\r\n
The socket will stay open and later run into a timeout, but the DMA channel is still open and partially written. The buffer gets freed (on stack), but the client or the WiFi/network stack keeps writing to that address via the open DMA channel after the <very long delay>.
Maybe the idea is totally off; if so, just ignore it.
I was also thinking about another behaviour I observed when using DWC over a VPN tunnel: sometimes RRF just stopped after I checked the status of the print and disconnected the VPN.
Edit:
- No timeout in the Network stack
- Timeout in the RRF
- Open DMA channel
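To try the delayed-header idea from a PC, a small test client could send the first part of the request, wait, and then send the rest. The following is a rough C++ sketch for Linux/POSIX sockets; `splitRequest` and `sendWithPause` are made-up names, and whether this actually provokes the crash is untested.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <chrono>
#include <cstdint>
#include <string>
#include <thread>
#include <utility>

// Hypothetical test helper: splits a raw HTTP request at the header
// where sending should resume after the pause.
std::pair<std::string, std::string> splitRequest(const std::string& request,
                                                 const std::string& resumeAt)
{
    const std::string::size_type pos = request.find(resumeAt);
    if (pos == std::string::npos)
    {
        return { request, std::string() };   // nothing to delay
    }
    return { request.substr(0, pos), request.substr(pos) };
}

// Sends 'first', sleeps for 'pause', then sends 'second' on the same TCP connection.
// Returns false on any socket error, including the peer closing the connection.
bool sendWithPause(const std::string& ipv4, std::uint16_t port,
                   const std::string& first, const std::string& second,
                   std::chrono::seconds pause)
{
    assert(pause.count() >= 0);
    const int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { return false; }

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);

    // MSG_NOSIGNAL makes a closed connection show up as an error instead of SIGPIPE.
    bool ok = ::inet_pton(AF_INET, ipv4.c_str(), &addr.sin_addr) == 1
        && ::connect(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0
        && ::send(fd, first.data(), first.size(), MSG_NOSIGNAL) == static_cast<ssize_t>(first.size());
    if (ok)
    {
        std::this_thread::sleep_for(pause);
        ok = ::send(fd, second.data(), second.size(), MSG_NOSIGNAL) == static_cast<ssize_t>(second.size());
    }
    ::close(fd);
    return ok;
}
```

The request from the example would be split at "Accept-Encoding:" and the two halves passed to `sendWithPause` together with the board's address, port 80 and a pause longer than the timeout in RRF.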
-
@NeoDue said in Reboots/crashes - RRF 3.5.0-rc1:
@Exerqtor may I ask about the computer you ran Chrome on in that test? Which OS? Notebook or Desktop? Screensaver, sleep or hibernation / energy saving mode enabled or disabled? Unless I overlook something, one of these things probably made the difference here.
It's on a desktop running Windows 10 Pro x64, and hibernation/power saving has been turned off while I'm running these tests.
So it's either Chrome that's causing the issues, or both. We will see whether it crashes with only Orca Slicer running (as it is now).
@dc42 So I just came home, and there have been no crashes since 05:50 this morning at least. This is with only Orca Slicer open. Gonna let it stay idle like this until tomorrow morning and see what happens. But I've got a sneaking feeling Orca Slicer (by itself at least) ain't the cause of the crashes.
-
@Exerqtor have you ever seen this issue when running DuetWiFiServer 1.27? If you haven't, or you are not sure, please try it.
-
@dc42 Not sure, but I'm pretty sure it came after moving from 1.27. I'll revert to it now. It shouldn't be an issue to run 1.27 with 3.5x on everything else?
-
@Exerqtor you can use 1.27 with 3.5x
-
OK, with that it's now on WiFiServer 1.27, and I'm back to having two Chrome tabs and one Orca Slicer instance open (since that seems to be what made it crash most often). #waitinggame
-
@T3P3Tony & @dc42 Just had a crash on WiFiServer 1.27 as well:
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.5.0-rc.1++wdb (2023-12-11 09:30:48) running on Duet 3 Mini5plus WiFi (standalone mode)
Board ID: XNHXF-HR6KL-K65J0-409N2-K9W1Z-RV2MZ
Used output buffers: 1 of 40 (40 max)
=== RTOS ===
Static ram: 102844
Dynamic ram: 123224 of which 0 recycled
Never used RAM 12492, free system stack 182 words
Tasks: NETWORK(1,ready,23.6%,219) HEAT(3,nWait,0.0%,352) Move(4,nWait,0.0%,358) CanReceiv(6,nWait,0.0%,797) CanSender(5,nWait,0.0%,336) CanClock(7,delaying,0.0%,350) TMC(4,nWait,0.7%,108) MAIN(1,running,69.2%,670) IDLE(0,ready,5.6%,29) AIN(4,delaying,0.8%,264), total 100.0%
Owned mutexes: WiFi(NETWORK)
=== Platform ===
Last reset 00:28:00 ago, cause: software
Last software reset at 2023-12-13 19:25, reason: AssertionFailed, Gcodes spinning, available RAM 10924, slot 2
Software reset code 0x4123 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00000000 BFAR 0xe000ed38 SP 0x20011fbc Task NETW Freestk 495 ok
Stack: 00000919 000af3d0 0002de6f 2002bed4 2002be01 000001af 2002c100 20030c40 2002c118 2001e888 a5a5a5a5 0d0a0d36 0d312e30 00000000 00000000 00000000 20030c4c 00000800 20035970 2002c100 20018678 2002bf9d 20018678 2001e888 0003010f 00000000 00000000
Error status: 0x04
Aux0 errors 0,0,0
MCU revision 3, ADC conversions started 1680921, completed 1680920, timed out 0, errs 0
MCU temperature: min 35.5, current 36.2, max 39.1
Supply voltage: min 11.5, current 24.1, max 24.2, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/27, heap memory allocated/used/recyclable 2048/404/52, gc cycles 3230
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 0, read errors 0, write errors 1, ifcnt 97, reads 22904, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 1: standstill, SG min 2, read errors 0, write errors 1, ifcnt 95, reads 22904, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 2: standstill, SG min 0, read errors 0, write errors 1, ifcnt 200, reads 22904, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 3: standstill, SG min 2, read errors 0, write errors 1, ifcnt 203, reads 22903, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 4: standstill, SG min 0, read errors 0, write errors 1, ifcnt 197, reads 22904, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 5: not present
Driver 6: not present
Date/time: 2023-12-13 19:53:05
Cache data hit count 2936262464
Slowest loop: 10.26ms; fastest: 0.13ms
=== Storage ===
Free file entries: 18
SD card 0 detected, interface speed: 22.5MBytes/sec
SD card longest read time 7.6ms, write time 4.6ms, max retries 0
=== Move ===
DMs created 83, segments created 0, maxWait 0ms, bed compensation in use: none, height map offset 0.000, max steps late 0, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Movement locks held by null, null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is doing "G4 P10" in state(s) 0 0, running macro
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x0000803
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== Filament sensors ===
in 0 notIn 0
Extruder 0 sensor: no filament
=== CAN ===
Messages queued 15138, received 34498, lost 0, errs 0, boc 0
Longest wait 3ms for reply type 6053, peak Tx sync delay 265, free buffers 26 (min 25), ts 8405/8404/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 13.65ms; fastest: 0.00ms
Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
HTTP sessions: 4 of 8
=== WiFi ===
Interface state: active
Module is connected to access point
Failed messages: pending 0, notrdy 0, noresp 0
Firmware version 1.27
MAC address c4:5b:be:ce:91:93
Module reset reason: Power up, Vcc 3.38, flash size 2097152, free heap 25464
WiFi IP address 192.168.10.x
Signal strength -55dBm, channel 0, mode 802.11n, reconnections 0
Clock register 00002002
Socket states: 0 0 0 0 0 0 0 0
-
@Exerqtor thanks! Let me add my observations in detail, maybe they help someone here. I honestly don't know what happens in detail when the browser is open on a "sleeping" computer:
- Desktop PC running MX Linux or Linux Mint with Vivaldi open and no power saving active except turning the monitor off: DWC tabs do not crash, Duet does not crash.
- Notebook (in my case an old MacBook Air) running Windows 10 x64 with the same Vivaldi version, hibernation turned off, but higher sleep modes active: tabs with DWC frequently do not recover when the computer is woken from sleep. Duet crashes occur (on a Duet 3 6HC running RRF 3.5.0 beta 4 or 3.5.0rc1), but not every time a browser tab crashes. As far as I remember, however, several browser tab crashes seem to increase the risk of the Duet itself crashing.
I still have my old printer here, which runs on a Duet 2 WiFi with RRF 3.3 or an early 3.4 version. If it helps anyone, I could try to recreate those crashes on the new board and then check whether the old one is affected as well.
-
@NeoDue I'm interested in seeing whether you're able to recreate this on your setup with the same results!
Otherwise, there was another crash overnight, with the same test criteria as in the last report:
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.5.0-rc.1++wdb (2023-12-11 09:30:48) running on Duet 3 Mini5plus WiFi (standalone mode)
Board ID: XNHXF-HR6KL-K65J0-409N2-K9W1Z-RV2MZ
Used output buffers: 8 of 40 (40 max)
=== RTOS ===
Static ram: 102844
Dynamic ram: 122040 of which 0 recycled
Never used RAM 13676, free system stack 182 words
Tasks: NETWORK(2,nWait,410.5%,236) HEAT(3,nWait,0.4%,327) Move(4,nWait,0.0%,356) CanReceiv(6,nWait,0.9%,797) CanSender(5,nWait,0.0%,336) CanClock(7,delaying,0.1%,350) TMC(4,nWait,12.4%,108) MAIN(1,running,1216.0%,670) IDLE(0,ready,97.1%,29) AIN(4,delaying,14.7%,264), total 1752.3%
Owned mutexes:
=== Platform ===
Last reset 01:41:13 ago, cause: software
Last software reset at 2023-12-14 04:08, reason: AssertionFailed, Gcodes spinning, available RAM 12492, slot 1
Software reset code 0x4123 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00000000 BFAR 0xe000ed38 SP 0x20011fbc Task NETW Freestk 495 ok
Stack: 00000919 000af3d0 0002de6f 2002bed4 2002be01 000001ae 2002c100 20030c58 2002c118 2001e888 a5a5a5a5 a5a5a5a5 0a0d312e 00000000 00000000 00000000 20030c64 00000800 20035988 2002c100 20018678 2002bf9d 20018678 2001e888 0003010f 00000000 00000000
Error status: 0x04
Aux0 errors 0,0,0
MCU revision 3, ADC conversions started 6073286, completed 6073286, timed out 0, errs 0
MCU temperature: min 34.7, current 35.3, max 38.4
Supply voltage: min 21.7, current 24.0, max 27.4, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/23, heap memory allocated/used/recyclable 2048/1496/1184, gc cycles 11249
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 16, read errors 0, write errors 1, ifcnt 135, reads 57475, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 1: standstill, SG min 208, read errors 0, write errors 1, ifcnt 133, reads 57475, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 2: standstill, SG min 2, read errors 0, write errors 1, ifcnt 233, reads 57475, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 3: standstill, SG min 0, read errors 0, write errors 1, ifcnt 235, reads 57474, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 4: standstill, SG min 2, read errors 0, write errors 1, ifcnt 229, reads 57475, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 5: not present
Driver 6: not present
Date/time: 2023-12-14 05:49:51
Cache data hit count 4294967295
Slowest loop: 14.59ms; fastest: 0.13ms
=== Storage ===
Free file entries: 18
SD card 0 detected, interface speed: 22.5MBytes/sec
SD card longest read time 7.7ms, write time 4.4ms, max retries 0
=== Move ===
DMs created 83, segments created 0, maxWait 0ms, bed compensation in use: none, height map offset 0.000, max steps late 0, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Movement locks held by null, null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0 0, running macro
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x0000803
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== Filament sensors ===
in 0 notIn 0
Extruder 0 sensor: no filament
=== CAN ===
Messages queued 54669, received 124545, lost 0, errs 1, boc 0
Longest wait 2ms for reply type 6031, peak Tx sync delay 271, free buffers 26 (min 25), ts 30367/30366/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 12.60ms; fastest: 0.00ms
Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
HTTP sessions: 4 of 8
=== WiFi ===
Interface state: active
Module is connected to access point
Failed messages: pending 0, notrdy 0, noresp 0
Firmware version 1.27
MAC address c4:5b:be:ce:91:93
Module reset reason: Power up, Vcc 3.38, flash size 2097152, free heap 22544
WiFi IP address 192.168.10.x
Signal strength -56dBm, channel 0, mode 802.11n, reconnections 0
Clock register 00002002
Socket states: 0 0 0 0 0 0 0 0
-
@NeoDue thanks for this information. We need the M122 report(s) from when the Duet crashes. DWC not recovering from sleep modes etc. may be related, but it may also be a different issue, so let's leave that aside for now and focus on what happens when RRF crashes.
-
@NeoDue It does not hurt to grab an M122 before you start, but nothing specific is needed in advance.
-
Just got back home from work, to another crash on 1.27.
Besides crashing (seemingly just as often as the 2.4 betas), DWC loads sooooo damned slow with 1.27!
Anywho, another report:
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.5.0-rc.1++wdb (2023-12-11 09:30:48) running on Duet 3 Mini5plus WiFi (standalone mode)
Board ID: XNHXF-HR6KL-K65J0-409N2-K9W1Z-RV2MZ
Used output buffers: 14 of 40 (40 max)
=== RTOS ===
Static ram: 102844
Dynamic ram: 122016 of which 0 recycled
Never used RAM 13700, free system stack 182 words
Tasks: NETWORK(2,nWait,23.4%,211) HEAT(3,nWait,0.0%,352) Move(4,nWait,0.0%,358) CanReceiv(6,nWait,0.0%,797) CanSender(5,nWait,0.0%,336) CanClock(7,delaying,0.0%,350) TMC(4,nWait,0.7%,108) MAIN(1,running,69.4%,670) IDLE(0,ready,5.6%,29) AIN(4,delaying,0.8%,264), total 100.0%
Owned mutexes: WiFi(NETWORK)
=== Platform ===
Last reset 00:18:49 ago, cause: software
Last software reset at 2023-12-14 15:39, reason: AssertionFailed, Gcodes spinning, available RAM 13676, slot 1
Software reset code 0x4123 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00489000 BFAR 0xe000ed38 SP 0x20011fbc Task NETW Freestk 495 ok
Stack: 00000919 000af3d0 0002de6f 2002bed4 2002be01 000001af 2002c168 20031ca0 2002c180 2001e888 a5a5a5a5 0d0a0d36 0d312e30 00000000 00000000 00000000 20031cac 00000800 200359a0 2002c168 20018678 2002bf9d 20018678 2001e888 0003010f 00000000 00000000
Error status: 0x04
Aux0 errors 0,0,0
MCU revision 3, ADC conversions started 1129622, completed 1129621, timed out 0, errs 0
MCU temperature: min 34.8, current 35.3, max 38.2
Supply voltage: min 21.7, current 24.1, max 27.4, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/22, heap memory allocated/used/recyclable 2048/1604/1304, gc cycles 2081
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 16, read errors 0, write errors 1, ifcnt 177, reads 59429, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 1: standstill, SG min 208, read errors 0, write errors 1, ifcnt 175, reads 59429, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 2: standstill, SG min 2, read errors 0, write errors 1, ifcnt 19, reads 59429, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 3: standstill, SG min 0, read errors 0, write errors 1, ifcnt 21, reads 59428, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 4: standstill, SG min 2, read errors 0, write errors 1, ifcnt 15, reads 59429, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 5: not present
Driver 6: not present
Date/time: 2023-12-14 15:57:58
Cache data hit count 1957951974
Slowest loop: 13.50ms; fastest: 0.13ms
=== Storage ===
Free file entries: 18
SD card 0 detected, interface speed: 22.5MBytes/sec
SD card longest read time 7.8ms, write time 4.0ms, max retries 0
=== Move ===
DMs created 83, segments created 0, maxWait 0ms, bed compensation in use: none, height map offset 0.000, max steps late 0, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Movement locks held by null, null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0 0, running macro
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x0000803
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== Filament sensors ===
in 0 notIn 0
Extruder 0 sensor: no filament
=== CAN ===
Messages queued 10177, received 23198, lost 0, errs 1, boc 0
Longest wait 2ms for reply type 6031, peak Tx sync delay 261, free buffers 26 (min 25), ts 5649/5648/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 12.58ms; fastest: 0.00ms
Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
HTTP sessions: 4 of 8
=== WiFi ===
Interface state: active
Module is connected to access point
Failed messages: pending 0, notrdy 0, noresp 0
Firmware version 1.27
MAC address c4:5b:be:ce:91:93
Module reset reason: Power up, Vcc 3.38, flash size 2097152, free heap 25624
WiFi IP address 192.168.10.x
Signal strength -54dBm, channel 0, mode 802.11n, reconnections 0
Clock register 00002002
Socket states: 0 0 0 0 0 0 0 0
-
@dc42
I was looking at the code and found your comment about the failed assert:
// Look at the response
#if SAME5x //TEMP DEBUG
CheckStackValue(9, ra); //***** This is the check that occasionally fails ******
#endif
Looking backwards from here, the code resets the SPI and disables the DMA channel. There is a comment about disabling the DMA channel on the SAME5x:
// Disable a channel. Also clears its status and disables its interrupts.
// On the SAME5x it is sometimes impossible to disable a channel. So we now return true if disabling it succeeded, false it it is still enabled.
bool DmacManager::DisableChannel(const uint8_t channel) noexcept {
But the return value is not checked in this function:
static inline void spi_rx_dma_disable() noexcept
From there I looked in the datasheet of the chip to check the DMA function and found the following:
So it is possible that a DMA transfer is already scheduled but not yet processed, and that this blocks/prevents the graceful disable from taking place.
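To illustrate what checking that return value could look like, here is a small self-contained C++ sketch. It is not RRF code: `FakeDmaChannel`, `DisableWithRetry` and `SafeToReleaseBuffer` are made-up names, and the fake channel only imitates a channel that refuses to turn off while a transfer is still pending. How the real firmware should react when disabling fails is a separate question.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a DMA channel that cannot be disabled
// while a transfer is still pending (imitation only, not the SAME5x DMAC).
struct FakeDmaChannel
{
    unsigned int pendingPolls;   // number of disable attempts that will still fail
    bool enabled;

    // Mirrors the idea of a disable call that returns true only on success.
    bool TryDisable() noexcept
    {
        if (pendingPolls > 0)
        {
            --pendingPolls;
            return false;        // channel is still enabled
        }
        enabled = false;
        return true;
    }
};

// Calls tryDisable until it reports success or maxAttempts attempts were made.
bool DisableWithRetry(const std::function<bool()>& tryDisable, unsigned int maxAttempts)
{
    assert(maxAttempts > 0);
    for (unsigned int attempt = 0; attempt < maxAttempts; ++attempt)
    {
        if (tryDisable()) { return true; }
    }
    return false;
}

// Reports the receive buffer as safe to reuse only once the channel is really off.
bool SafeToReleaseBuffer(FakeDmaChannel& channel, unsigned int maxAttempts)
{
    return DisableWithRetry([&channel] { return channel.TryDisable(); }, maxAttempts);
}
```

The point of the sketch is the last function: the buffer is only treated as free when the disable call has confirmed success, instead of assuming it worked.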
-
Woah, great catch! Hope you're on to something there!
And some more data points:
M122
=== Diagnostics ===
RepRapFirmware for Duet 3 Mini 5+ version 3.5.0-rc.1++wdb (2023-12-11 09:30:48) running on Duet 3 Mini5plus WiFi (standalone mode)
Board ID: XNHXF-HR6KL-K65J0-409N2-K9W1Z-RV2MZ
Used output buffers: 1 of 40 (40 max)
=== RTOS ===
Static ram: 102844
Dynamic ram: 122016 of which 0 recycled
Never used RAM 13700, free system stack 186 words
Tasks: NETWORK(1,ready,23.2%,217) HEAT(3,nWait,0.0%,335) Move(4,nWait,0.0%,344) CanReceiv(6,nWait,0.0%,797) CanSender(5,nWait,0.0%,336) CanClock(7,delaying,0.0%,350) TMC(4,delaying,0.7%,108) MAIN(1,running,69.6%,670) IDLE(0,ready,5.5%,29) AIN(4,delaying,0.8%,264), total 100.0%
Owned mutexes:
=== Platform ===
Last reset 00:25:42 ago, cause: software
Last software reset at 2023-12-14 20:16, reason: AssertionFailed, Expansion spinning, available RAM 13676, slot 2
Software reset code 0x4132 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00000000 BFAR 0xe000ed38 SP 0x20011fbc Task NETW Freestk 495 ok
Stack: 00000919 000af3d0 0002de6f 2002bed4 2002be01 000001af 2002c100 20031458 2002c118 2001e888 a5a5a5a5 a5a5a5a5 0d312e30 00000000 00000000 00000000 20031464 00000800 20035970 2002c100 20018678 2002bf9d 20018678 2001e888 0003010f 00000000 00000000
Error status: 0x04
Aux0 errors 0,0,0
MCU revision 3, ADC conversions started 1542150, completed 1542150, timed out 0, errs 0
MCU temperature: min 35.3, current 35.9, max 38.6
Supply voltage: min 21.9, current 24.1, max 27.2, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/23, heap memory allocated/used/recyclable 2048/2048/1736, gc cycles 2851
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 16, read errors 0, write errors 1, ifcnt 191, reads 15599, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 1: standstill, SG min 208, read errors 0, write errors 1, ifcnt 189, reads 15599, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 2: standstill, SG min 2, read errors 0, write errors 1, ifcnt 33, reads 15598, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 3: standstill, SG min 0, read errors 0, write errors 1, ifcnt 35, reads 15598, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 4: standstill, SG min 2, read errors 0, write errors 1, ifcnt 29, reads 15599, writes 13, timeouts 0, DMA errors 0, CC errors 0
Driver 5: not present
Driver 6: not present
Date/time: 2023-12-14 20:42:33
Cache data hit count 2670623150
Slowest loop: 11.30ms; fastest: 0.13ms
=== Storage ===
Free file entries: 18
SD card 0 detected, interface speed: 22.5MBytes/sec
SD card longest read time 8.4ms, write time 3.6ms, max retries 0
=== Move ===
DMs created 83, segments created 0, maxWait 0ms, bed compensation in use: none, height map offset 0.000, max steps late 0, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 0 is on, I-accum = 0.0
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Movement locks held by null, null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is doing "G4 P10" in state(s) 0 0, running macro
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x0000803
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== Filament sensors ===
in 0 notIn 0
Extruder 0 sensor: no filament
=== CAN ===
Messages queued 13889, received 31654, lost 0, errs 1, boc 0
Longest wait 2ms for reply type 6031, peak Tx sync delay 300, free buffers 26 (min 25), ts 7711/7710/0
Tx timeouts 0,0,0,0,0,0
=== Network ===
Slowest loop: 13.57ms; fastest: 0.00ms
Responder states: MQTT(0) HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
HTTP sessions: 4 of 8
=== WiFi ===
Interface state: active
Module is connected to access point
Failed messages: pending 0, notrdy 0, noresp 0
Firmware version 1.27
MAC address c4:5b:be:ce:91:93
Module reset reason: Power up, Vcc 3.38, flash size 2097152, free heap 25624
WiFi IP address 192.168.10.x
Signal strength -53dBm, channel 0, mode 802.11n, reconnections 0
Clock register 00002002
Socket states: 0 0 0 0 0 0 0 0
-
@T3P3Tony thanks! Test is running now. One thing I found because the browser updated itself soon after the start: when the browser was closed for the update with Duet tabs open, there was a brief "incomplete transfer" message on the PanelDue, and then the Duet reset the WiFi connection (3.5.0rc1 with WiFi module firmware 2.1b6; at least reconnection seems reliable now with that firmware version). After that, one of the Duet tabs was dead. The only thing that differed in the log was the number of reconnections.
I will ignore that for now as you suggested and wait for a crash to happen. The printer is set to write an M122 log to a file after every reboot. Let's see what I find tomorrow.