Duet 2.05 memory leak?
-
Did you check an M122 to see if there were hiccups?
-
@Phaedrux no hiccups -- it just starts stuttering. I am back on version 2.03 RC2 -- the 2nd print with no resets is running fine. That's no indication that it's bug free, but it's working, so until I see a reason to move off it I'm staying on this build.
-
So I ran a bunch of M122s during the print and finally caught the issue -- underruns. @dc42 the count resets too often, and it would be nice to get an error on the screen when it gets critical. I switched to a brand new class 10 SD card, and the stuttering and all the weirdness stopped -- I'm back on version 2.05.1. As smart as the Duet is, an SD card that is not up to snuff and/or is dying should be something you can detect. It took me over 2 weeks of hunting for the issue. The underruns keep resetting, so it's almost impossible to go on that. Now underruns are 0,0 -- and the UI on the LCD is more responsive and shows the list of files and macros in an instant.
-
@kazolar Thanks for your persistence, and your report. SD card problems can have strange, and often not very obvious, effects. I don't know if the firmware can be set to detect SD card issues; that's one for @dc42. You can test an SD card's write speed with M122 P104 S[file size in MB]; the result is usually between 2 and 2.5Mbytes/sec. For me: Duet 2 WiFi - 2.23Mbytes/sec, Duet Maestro - 2.42Mbytes/sec, for a 10MB file.
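If you want to log the result of that test over time, here is a minimal Python sketch. It is not part of the firmware or DWC; the function names are made up for illustration, it assumes the reply text has the form "SD write speed for 20.0Mbyte file was 2.18Mbytes/sec", and the 2.0 floor is only the rough lower bound mentioned above, not an official limit.

```python
import re

# Assumed reply format: "SD write speed for <size>Mbyte file was <speed>Mbytes/sec"
SPEED_RE = re.compile(r"SD write speed for ([\d.]+)Mbyte file was ([\d.]+)Mbytes/sec")


def parse_sd_write_speed(reply):
    """Return the reported write speed in Mbytes/sec, or None if not found."""
    match = SPEED_RE.search(reply)
    return float(match.group(2)) if match else None


def looks_slow(speed, floor=2.0):
    """True if the speed is below the floor (an assumed rule of thumb)."""
    return speed < floor


reply = "SD write speed for 20.0Mbyte file was 2.18Mbytes/sec"
speed = parse_sd_write_speed(reply)
print(speed, "slow" if looks_slow(speed) else "ok")
```

A single good reading does not rule out a card problem, so it is probably worth repeating the test after the card has been in use for a few days.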
Ian
-
@kazolar underruns, and any of the other stats like that, are reset each time you run M122.
-
@droftarts there has got to be some way to respond to underruns above some level. Clearly the underruns were getting out of hand. If the firmware simply complained about underruns the way it complains about stepper phase warnings and other things of that nature, troubleshooting would be a lot easier. Also, the underruns seem to reset more often than just when running M122: I canceled the print and the underrun line in M122 was cleared out.
-
@kazolar How are the underruns actually reported in the M122? Is it just with the error status, or does it show in some other field? If you managed to save a copy of an M122 that shows it, that would be useful.
Ian
-
@droftarts here is what an M122 report looks like with underruns. This is from my own print just now. For me, it seems the underruns come from tiny segments created by Simplify3D for support structures, combined with high speeds and some amount of PA.
4/13/2020, 9:29:48 AM M122
=== Diagnostics ===
RepRapFirmware for Duet 2 WiFi/Ethernet version 2.05.1.1-simple_dynamic_unretraction running on Duet Ethernet 1.02 or later + DueX2
Board ID: 08DGM-956GU-DJMSN-6J9D4-3SJ6K-1BNBF
Used output buffers: 1 of 24 (16 max)
=== RTOS ===
Static ram: 25712
Dynamic ram: 93652 of which 0 recycled
Exception stack ram used: 480
Never used ram: 11228
Tasks: NETWORK(ready,628) HEAT(blocked,1232) DUEX(suspended,160) MAIN(running,3712) IDLE(ready,160)
Owned mutexes:
=== Platform ===
Last reset 23:30:20 ago, cause: power up
Last software reset at 2020-04-11 22:50, reason: Stuck in spin loop, spinning module GCodes, available RAM 11048 bytes (slot 2)
Software reset code 0x4043 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0041f80f BFAR 0xe000ed38 SP 0x20001f4c Task 0x5754454e
Stack: 00404463 004047e4 81000000 b0000000 412a3fa5 00000000 00000000 3331bb4c 41880000 3e178897 3e1cd04f bdb7f86e 423985c3 4050ac00 3cce8f96 40a00000 4453b9c2 c0000000 40f4ffb7 20000010 00404459 000003c8 00404aa9
Error status: 0
Free file entries: 9
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest block write time: 0.0ms, max retries 0
MCU temperature: min 36.6, current 37.6, max 38.8
Supply voltage: min 23.9, current 24.6, max 25.0, under voltage events: 0, over voltage events: 0, power good: yes
Driver 0: ok, SG min/max 0/1023
Driver 1: standstill, SG min/max 0/1023
Driver 2: standstill, SG min/max 0/135
Driver 3: ok, SG min/max 0/1023
Driver 4: standstill, SG min/max not available
Driver 5: standstill, SG min/max not available
Driver 6: standstill, SG min/max not available
Date/time: 2020-04-13 09:29:42
Cache data hit count 4294967295
Slowest loop: 17.11ms; fastest: 0.07ms
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
=== Move ===
Hiccups: 0, FreeDm: 158, MinFreeDm: 117, MaxWait: 0ms
Bed compensation in use: none, comp offset 0.000
=== DDARing ===
Scheduled moves: 1295584, completed moves: 1295544, StepErrors: 0, LaErrors: 0, Underruns: 595, 0
=== Heat ===
Bed heaters = 0 -1 -1 -1, chamberHeaters = -1 -1
Heater 0 is on, I-accum = 0.2
Heater 1 is on, I-accum = 0.5
=== GCodes ===
Segments left: 1
Stack records: 1 allocated, 0 in use
Movement lock held by null
http is idle in state(s) 0
telnet is idle in state(s) 0
file is doing "G1 X-29.037 Y13.502 E0.0004" in state(s) 0
serial is idle in state(s) 0
aux is idle in state(s) 0
daemon is idle in state(s) 0
queue is idle in state(s) 0
autopause is idle in state(s) 0
Code queue is empty.
=== Network ===
Slowest loop: 15.90ms; fastest: 0.06ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0) Telnet(0)
HTTP sessions: 1 of 8
Interface state 5, link 100Mbps full duplex
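Since the counters are cleared every time M122 runs, it can help to save each report and pull the underrun field out afterwards. Below is a minimal Python sketch for that. It is not part of RepRapFirmware; `parse_underruns` is just an illustrative name, and it assumes the field always appears as "Underruns: N, N" the way it does in the report above.

```python
import re

# Matches the "Underruns: <first>, <second>" field of the DDARing section.
UNDERRUN_RE = re.compile(r"Underruns:\s*(\d+)\s*,\s*(\d+)")


def parse_underruns(report):
    """Return (first, second) underrun counts from M122 text, or None."""
    match = UNDERRUN_RE.search(report)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = ("Scheduled moves: 1295584, completed moves: 1295544, "
        "StepErrors: 0, LaErrors: 0, Underruns: 595, 0")
first, second = parse_underruns(line)
print(f"first={first} second={second}")
```

Summing the values from successive reports would give a running total for the whole print, which a single M122 cannot show.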
-
@bot said in Duet 2.05 memory leak?:
=== DDARing ===
Scheduled moves: 1295584, completed moves: 1295544, StepErrors: 0, LaErrors: 0, Underruns: 595, 0
Thanks, I know where to look now!
Ian
-
@droftarts I think I read that the first number is a warning, the 2nd number will cause stutter or a pause if it gets bad. I can tell from switching SD cards, my gcode uploads are faster now -- hitting 700kb/sec -- almost maxing out the 100mb link -- never had over 500 before.
-
@kazolar I think the first value isn't a warning, just an indication that the lookahead function couldn't do something (not sure what) with the time given. It doesn't slow down the print, but is likely not ideal. The second number is a prepare move underrun, which means that the move could not be prepared in time and so the movement must wait. This is much worse than the first one.
Also, since I'm interested in SD card performance at the moment, I noticed your last comment and must correct you somewhat, just for your info: 700 kB/s is not nearly maxing out a 100 Mbps link. 100 Mbps = 12.5 MB/s
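To make the arithmetic explicit, here is a small Python sketch. It is only an illustration: it uses decimal units (1 MB = 1000 kB), ignores protocol overhead, and the function name is made up.

```python
def mbps_to_mbytes_per_sec(mbps):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8.0


link = mbps_to_mbytes_per_sec(100)  # raw capacity of a 100 Mbps link in MB/s
upload = 700 / 1000                 # 700 kB/s expressed in MB/s
share = upload / link               # fraction of the link actually used

print(f"link: {link} MB/s, upload uses {share:.1%} of it")
```

So a 700 kB/s upload uses only a few percent of the link's raw capacity.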
-
@bot yep, the 2nd number is the one that I would hope Duet would alarm about. And yep -- I got my decimal off -- was thinking 10mb. I think sleeping more than 5 hours per day may be catching up to me. What's curious though is I never saw transfers above 500kb/s before -- with the new SD card it was in the high 700s, touching 800. I mean even before I had issues with the SD card -- the old one wasn't that fast to begin with.
-
@dc42 -- what kind of SD card do I need?
It's happening again:
Scheduled moves: 24643, completed moves: 24607, StepErrors: 0, LaErrors: 0, Underruns: 80, 300
-
10:42:18 AM M122 P104 S20
Testing SD card write speed...
10:42:27 AM SD write speed for 20.0Mbyte file was 2.18Mbytes/sec
I printed the same file multiple times -- no issues -- and now I get an error. Does it have anything to do with the Duet being powered on/off? It works fine when I copy over a newly sliced/copied file, but for a file that has been there for a couple of days I start getting underruns again.
Should I send an SD card dismount command (M22, I think) before powering the machine off? Something is definitely getting corrupted -- I ran a bunch of commands, started and canceled a print -- and a newly copied file works fine.
-
Pending any other surprises -- I am now sending M22 before the printer is powered off (unless it's an unexpected shutdown) -- and with that in place, I have not had any more underrun problems.
-
@dc42 totally reproducible now -- it printed fine with the same file for 2 days, then on the 3rd day after power up -- underruns 30 minutes in. I stopped it, didn't reset, didn't power off, deleted the file, uploaded the exact same file -- now no underruns, working fine. I guess I can keep doing this procedure, but why are "old" files now going stale on the SD card somehow?
-
@kazolar And this is on 2.05? Or 2.02?
-
@kazolar I can't see how anything in the firmware is doing this. It doesn't rewrite the file to the card, except if you run a simulation (it appends the simulation time to the gcode file). My guess is that the SD card is doing some form of wear levelling, but causing issues doing it. What exact card (make and model) is this? I know you swapped to a new card from the one that was causing problems originally, but have you tried yet another?
Ian
-
I had the problem originally with the card that came with the Duet 2, like 3 years ago, so I figured it must be time to swap. I had the same issue with an Amplim brand card, then with an SP brand card. These have worked perfectly fine in Pis. Both cards are rated class 10.
-
Also the problem is the same with each version. If I copy a fresh file, it works fine for a couple of days and 2 power cycles, then like clockwork, 30 minutes into the next print it starts to stutter, though I'm sure underruns were piling up before that. I checked and the 2nd underrun number was at 300, so it was pretty bad.