Posts made by shogran

undefined

@dc42 It printed without any issues after I reduced the sensitivity, and as far as I could see the stall detection didn't trigger at all. Looks like that may have been the cause.

undefined

@dc42 Thanks, I've restarted the print with a reduced sensitivity on the stall detection and logging enabled. I'll update here with the outcome.

undefined

@dc42 said in Print stopped at 83% for seemingly no reason:

Thanks.

It looks like it occurred at layer 127 (Z height 17.66mm)

The GCode file says that layer 127 has Z height 12.700mm. So do you mean layer 127, or 12.700mm height? It's possible that DWC is displaying the wrong layer count.

Ah, that would make sense actually as it was printing with a 0.1mm layer height. DWC may have been displaying the wrong info in this instance.

I've reviewed the part of the firmware that handles an emergency pause (which is done when there is power loss or a stall is detected), and I found two issues:

When an emergency pause occurs, an off-by-one error meant that the number of scheduled moves ends up one higher than it should be;

After an emergency pause, the queue of deferred commands was not being resynced to allow for moves that were discarded from the queue.

What I think happened was that you had a large number of repeated stalls around the same point in the print, during or shortly before bridging moves. Each one caused the number of scheduled moves to increase by one, and some deferred M106 commands to be left in the queue instead of being purged. This explains the following part of the M122 report:

Code queue is not empty:
Queued 'M106 S204' for move 401967
Queued 'M106 S102' for move 401968
Queued 'M106 S204' for move 401970
Queued 'M106 S102' for move 401975
Queued 'M106 S204' for move 401978
Queued 'M106 S102' for move 401979
Queued 'M106 S204' for move 401981
Queued 'M106 S102' for move 401986
Queued 'M106 S204' for move 401989
Queued 'M106 S102' for move 401990
Queued 'M106 S204' for move 402016
Queued 'M106 S102' for move 402017
Queued 'M106 S204' for move 402019
Queued 'M106 S102' for move 402024
Queued 'M106 S204' for move 402027
Queued 'M106 S102' for move 402028

The move numbers are so close together that I think the multiple pairs of M106 commands must be repetitions of the same sequence of bridging moves.
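For anyone wanting to check that reading of the queue, here is a rough Python sketch that pulls the move numbers out of the listing above and prints the gaps between consecutive entries. The variable names are my own for illustration and nothing here is part of RepRapFirmware; it only does arithmetic on the pasted text.

```python
import re

# Code queue exactly as pasted from the M122 report above.
queue_dump = """\
Queued 'M106 S204' for move 401967
Queued 'M106 S102' for move 401968
Queued 'M106 S204' for move 401970
Queued 'M106 S102' for move 401975
Queued 'M106 S204' for move 401978
Queued 'M106 S102' for move 401979
Queued 'M106 S204' for move 401981
Queued 'M106 S102' for move 401986
Queued 'M106 S204' for move 401989
Queued 'M106 S102' for move 401990
Queued 'M106 S204' for move 402016
Queued 'M106 S102' for move 402017
Queued 'M106 S204' for move 402019
Queued 'M106 S102' for move 402024
Queued 'M106 S204' for move 402027
Queued 'M106 S102' for move 402028
"""

# Extract the move number each queued command is waiting for.
move_numbers = [int(n) for n in re.findall(r"for move (\d+)", queue_dump)]

# Gap between each queued command and the next one.
gaps = [later - earlier for earlier, later in zip(move_numbers, move_numbers[1:])]

print(gaps)  # [1, 2, 5, 3, 1, 2, 5, 3, 1, 26, 1, 2, 5, 3, 1]
```

The gaps cycle through 1, 2, 5, 3 with a single jump of 26 in the middle, which at least fits the idea that the same short run of moves was queued more than once.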

Do you think it possible that the stall detection was triggered 150 or more times, with at least the last 8 being all around the same place? If you have logging enabled, then the stalls may be recorded in the log file.

Unfortunately I didn't have logging enabled, though I have now added it to my config.g for future use. I think it's possible that it was triggered that many times, as I've only recently enabled it and am still tuning the sensitivity. I've definitely been in the room when it has triggered several times within a few seconds at the same point before moving past it.

undefined

@dc42 said in Print stopped at 83% for seemingly no reason:

@shogran said in Print stopped at 83% for seemingly no reason:

I do have stall detection enabled which I know triggered a few times; could that have caused it/acted the same way as a pause?

Thanks, that may be where the problem is. What action do you have configured when a stall is detected?

This is my stall detection line:

M915 X Y S10 R3 F0 ; Stall detection, set X and Y to sensitivity 10, pause print, rehome, unfiltered

and this is the rehome.g file it's using:

G91                       ; relative mode
G1 Z5 F200 S2 ; lower bed
G1 S1 X-305 Y274.5 F3000   ; coarse home X or Y
G1 S1 X-305              ; coarse home X
G1 S1 Y274.5               ; coarse home Y
G1 X4 Y-4 F600             ; move away from the endstops
G1 S1 X-305                ; fine home X
G90
G1 Y274.5             ; fine home Y
G91
G1 Z-5 F200 S2 ; raise bed back up

undefined

@dc42 said in Print stopped at 83% for seemingly no reason:

I'm sorry, I think you have found a firmware bug. This line from your report:

Scheduled moves: 402064, completed moves: 401876

has a difference of 188, which is much greater than the size of the move queue. So it must be failing to count some completed moves. That has caused the code queue to become full with M106 commands; so the firmware is waiting for the completed move count to catch up.
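As a quick check of that figure, a minimal Python sketch that parses the quoted M122 line and subtracts the two counts. The helper name `move_backlog` is made up for this example and is not anything in the firmware.

```python
import re

# The exact line from the M122 report quoted above.
report_line = "Scheduled moves: 402064, completed moves: 401876"


def move_backlog(line: str) -> int:
    """Return scheduled minus completed moves from an M122 move-count line."""
    match = re.search(
        r"Scheduled moves:\s*(\d+),\s*completed moves:\s*(\d+)", line
    )
    if match is None:
        raise ValueError("line does not look like an M122 move-count line")
    scheduled, completed = (int(group) for group in match.groups())
    return scheduled - completed


backlog = move_backlog(report_line)
print(backlog)  # 188
```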

I think your only hope of saving the print is to pause it, check that a resurrect.g file has been created, use Emergency Stop to reset the Duet, and use the resurrect feature (M916) to restart the print. ~~I think firmware 2.02RC3 inserts a G92 command in resurrect.g prior to calling resurrect-prologue.g, and if that is the case then you probably only need a M116 command in resurrect-prologue.g.~~

EDIT: I just checked the code. To avoid resetting the Duet (which will power the motors down, potentially losing position) you can pause the print, back up the resurrect.g file, then cancel the print. Then use a text editor to remove the G92 command from resurrect.g. Then use M916 to resume the print (as before you will need M116 in the resurrect-prologue.g file).

@dc42 said in Print stopped at 83% for seemingly no reason:

PS - I'm trying to track down how the move counts became so different. Did you pause the print a large number of times?

Thanks for the feedback dc, much appreciated.

I did pause and resume the file a few times when I noticed it wasn't printing, but not while the print was working. I do have stall detection enabled which I know triggered a few times; could that have caused it/acted the same way as a pause?

undefined

Hi there,

Started an overnight print that should have taken ~10 hours and noticed the next day it was still sitting at 83% via the web interface. Thought nothing of it and checked back an hour later to see it was still at the same point. I assumed the interface had frozen, so I went and checked on the printer and saw that the printhead was stopped in place on the incomplete model.

The bed and nozzle were still hot and, according to the interface, at the correct temperatures of 75 and 240 degrees respectively. I could pause the print, which would move the head back to 0, then resume; however, the resumed print would just continue to sit in the same place. There were no errors in the console, and no matter how many times I paused and resumed the print it did not continue. In the end I ran M122 and then cancelled the print. I'm running version 2.02RC3 with server version 1.21 and web interface 1.22.4-b1. The GCODE file was sliced in Simplify3D 4 and uploaded via FileZilla; I've downloaded the file and it looks fine to me. It looks like it occurred at layer 127 (Z height 17.66mm). The machine is a CoreXY.

Due to the size of the files I've put the GCODE and some screenshots/photos here along with my config.g file and the M122 text which I'll try to get below as a code snippet: https://www.dropbox.com/sh/384kq6ocd85e95e/AADj-WkBKcL7vvWY1bmKvIn8a?dl=0

If I can provide anything else that will make diagnosing this easier then let me know!

EDIT: If I wrap the code below in the ``` code tags I get told I can't submit as "Post content was flagged as spam by Akismet.com".

10:09:35
M122
=== Diagnostics ===
RepRapFirmware for Duet 2 WiFi/Ethernet version 2.02RC3(RTOS) running on Duet WiFi 1.02 or later
Board ID: 08DDM-9FAM2-LW4S8-6JTDG-3SD6T-13YBW
Used output buffers: 1 of 20 (18 max)
=== RTOS ===
Static ram: 28532
Dynamic ram: 98600 of which 0 recycled
Exception stack ram used: 624
Never used ram: 3316
Tasks: NETWORK(ready,328) HEAT(blocked,1232) MAIN(running,3484)
Owned mutexes:
=== Platform ===
Last reset 14:38:10 ago, cause: power up
Last software reset at 2018-12-04 19:40, reason: User, spinning module GCodes, available RAM 3452 bytes (slot 1)
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0041f000 BFAR 0xe000ed38 SP 0xffffffff Task 0x4e49414d
Error status: 0
Free file entries: 9
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest block write time: 0.0ms, max retries 0
MCU temperature: min 38.6, current 38.7, max 38.9
Supply voltage: min 23.2, current 24.2, max 24.5, under voltage events: 0, over voltage events: 0
Driver 0: standstill, SG min/max not available
Driver 1: standstill, SG min/max not available
Driver 2: standstill, SG min/max not available
Driver 3: standstill, SG min/max not available
Driver 4: standstill, SG min/max not available
Date/time: 2018-12-07 10:09:32
Cache data hit count 4294967295
Slowest loop: 1.25ms; fastest: 0.07ms
=== Move ===
Hiccups: 0, StepErrors: 0, LaErrors: 0, FreeDm: 240, MinFreeDm: 240, MaxWait: 0ms, Underruns: 0, 0
Scheduled moves: 402064, completed moves: 401876
Bed compensation in use: mesh
Bed probe heights: 0.000 0.000 0.000 0.000 0.000
=== Heat ===
Bed heaters = 0 -1 -1 -1, chamberHeaters = -1 -1
Heater 0 is on, I-accum = 0.4
Heater 1 is on, I-accum = 0.6
=== GCodes ===
Segments left: 0
Stack records: 2 allocated, 0 in use
Movement lock held by null
http is idle in state(s) 0
telnet is idle in state(s) 0
file is doing "M106 S204" in state(s) 0
serial is idle in state(s) 0
aux is idle in state(s) 0
daemon is idle in state(s) 0
queue is idle in state(s) 0
autopause is idle in state(s) 0
Code queue is not empty:
Queued 'M106 S204' for move 401967
Queued 'M106 S102' for move 401968
Queued 'M106 S204' for move 401970
Queued 'M106 S102' for move 401975
Queued 'M106 S204' for move 401978
Queued 'M106 S102' for move 401979
Queued 'M106 S204' for move 401981
Queued 'M106 S102' for move 401986
Queued 'M106 S204' for move 401989
Queued 'M106 S102' for move 401990
Queued 'M106 S204' for move 402016
Queued 'M106 S102' for move 402017
Queued 'M106 S204' for move 402019
Queued 'M106 S102' for move 402024
Queued 'M106 S204' for move 402027
Queued 'M106 S102' for move 402028
16 of 16 codes have been queued.
=== Network ===
Slowest loop: 178.61ms; fastest: 0.08ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0) Telnet(0)
HTTP sessions: 1 of 8

WiFi -
Network state is running
WiFi module is connected to access point
Failed messages: pending 0, notready 0, noresp 0
WiFi firmware version 1.21
WiFi MAC address 5c:cf:7f:ee:68:10
WiFi Vcc 3.39, reset reason Turned on by main processor
WiFi flash size 4194304, free heap 14168
WiFi IP address 192.168.1.120
WiFi signal strength -76dBm, reconnections 0, sleep mode modem
Socket states: 0 0 0 0 0 0 0 0
=== Expansion ===
10:09:33
Message Log cleared!

undefined

That's with everything put back. Going to tune it all in properly and get a mesh compensation run once I have it a bit tighter

undefined

So it's not perfect, but it's an improvement!

0.533 0.533 0.519 0.501 0.527 results from probing the same point.
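To put a number on "not perfect", here is a small Python sketch that summarises those five readings. I'm assuming the values are in millimetres, as probe trigger heights usually are; that unit is my assumption and not stated above.

```python
from statistics import mean, pstdev

# The five readings from probing the same point, as posted above.
# Units assumed to be millimetres.
readings = [0.533, 0.533, 0.519, 0.501, 0.527]

# Difference between the highest and lowest reading.
spread = max(readings) - min(readings)

print(f"mean   = {mean(readings):.4f}")
print(f"range  = {spread:.3f}")
print(f"stdev  = {pstdev(readings):.4f}")
```

The range works out to 0.032 around a mean of 0.5226, so the repeatability is in the hundredths rather than the thousandths.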

Thanks all for your patience and feedback. You've been awesome!

undefined

Okay, I ran the same macro as is with all those settings turned right down and M350 commented out. Unfortunately no fix.

Reset the settings and bumped Z motor amps up to 2.0 from 1.8. No better.

I then ran with a 2 second gap between every command and this did not work.

If I command the Z +10 then -10 via the web control buttons, it moves up 10 and down 10. ~~No problems. I noticed this uses G91 before the movements would this be making a difference?~~

EDIT: It appears to be struggling intermittently on the upward moves when using the web controls. This leads me to believe the problem is not the bed lowering too far, but the bed not being raised enough. I'm going to take the screws off and re-assemble.

undefined

Escaping the office in 2 hours, will try those things first.

All motors are 1.8°. In theory my steps/mm are correct based on calculations using those variables. Just using that as an example of a time when the bed moves in the downwards direction without dropping.
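For reference, this is the kind of steps/mm calculation I mean, sketched in Python. Only the 1.8° step angle comes from this thread; the microstepping, gear ratio and leadscrew lead in the example call are hypothetical placeholder values, so substitute your own.

```python
def steps_per_mm(step_angle_deg: float,
                 microsteps: int,
                 motor_revs_per_screw_rev: float,
                 screw_lead_mm: float) -> float:
    """Steps needed to move a leadscrew-driven axis by 1 mm."""
    # A 1.8 degree motor makes 360 / 1.8 = 200 full steps per revolution.
    full_steps_per_rev = 360.0 / step_angle_deg
    # Microsteps per screw revolution, divided by travel per screw revolution.
    return (full_steps_per_rev * microsteps * motor_revs_per_screw_rev
            / screw_lead_mm)


# Hypothetical example values (NOT taken from my config):
# x16 microstepping, 2 motor revs per screw rev, 8 mm lead.
example = steps_per_mm(step_angle_deg=1.8,
                       microsteps=16,
                       motor_revs_per_screw_rev=2.0,
                       screw_lead_mm=8.0)
print(example)
```

If the measured travel matches the commanded travel in one direction but not the other, a formula like this can't be the whole story, since a wrong steps/mm would be wrong both ways.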

undefined

@number40fan:

No problem. While I am thinking about it though, if you still don't see any looseness and since the last thing you did was replace a lead screw…take that one back out and flip it over and test again. Maybe, it is rougher on one side versus the other. Rough coming down, no biggy because you have gravity helping.

There's still some wiggle room in the amps being given to the Z motor so I'll play with that a little bit. Thinking about it, because of the CoreXY design, whenever I measure bed movement to check steps/mm, I measure it lowering, and it always seems to move correctly unless the downward motion follows an upward motion, as we discovered last night.

Is there any significance there? Perhaps acceleration or jerk could be moving it too quickly in the opposite direction?

undefined

@number40fan:

Has to be torque. One more test. Lower your Z speed down to almost nothing. I see it is already low, but go lower. Run the last thing I posted while watching for anything that might not be doing what it is supposed to.

I've just had to power down, up early for work in the morning.

I'll run it slow first chance I get tomorrow and let you know what it does.

Many thanks again!

undefined

@number40fan:

If you want to increase the time, just change the S*. That is in seconds, or you can change it completely to P**** and run it in milliseconds.

Well damn. It's not the probe. Thank you for coming up with that test, because it's very likely I wouldn't have ever considered it. Got hung up on the probe being off.

For some reason my bed is lowering an extra ~2mm when moving down and not when going up. Not enough holding torque?

I'm running a three leadscrew system powered by one hefty NEMA17 with 2:1 gearing.

undefined

@number40fan:

When you probed the same point over and over and had the failure, I think you can rule out a bad connection that is caused by any wire movement on the BL.

Did the G32 pause between probing and lowering the bed? Curious if that worked or not.

It did indeed! Very nifty, ran it again as I realised it gave me time to measure the drop. It looks like it's losing 2mm every probe.

I'll try that macro over 25mm now

undefined

@number40fan:

I think the stalled is just where it is when you ran the test. (Not entirely certain) Just tried it with my printer that is running and it changes every time I run M122. It doesn't show an error, so that is good. As for why it is happening, not sure. Hope to get you to run the G32 test and see how it goes.

Thanks for taking the time to look.

Just ran the G32 with your code and it's failing in the same way unfortunately: it probes 4 points, then fails. I've checked my bed movement, and when I command a 10mm move, it definitely only moves 10mm, so it's not running too many steps per mm.

I was wondering if there could be a loose connection somewhere, but the first reading I do seems accurate. It's just all subsequent readings seem to move the bed. Would have thought if the connection was off either all readings would be funky or it wouldn't work full stop.

undefined

@number40fan:

Does M122 show any errors?

To me, it seems like the Z motor just doesn't have the power to push the bed back up. It would have to be missing steps, which I think M122 would show.

I wonder how it would act if you set up probing points for G32 and put a G4 P*** after every move of the bed.

Running M122 I see some stalled drives. Does that mean it's missing steps? If this is the case, any ideas why this would suddenly occur here? Before adding the BL Touch I printed a few parts on my old endstop setup and never noticed the bed not operating correctly or any artifacts in the Z axis on my prints.

M122
=== Diagnostics ===
Used output buffers: 3 of 32 (14 max)
=== Platform ===
RepRapFirmware for Duet WiFi version 1.19.2 running on Duet WiFi 1.0
Board ID: 08DDM-9FAM2-LW4S8-6JTDG-3SD6T-13YBW
Static ram used: 21176
Dynamic ram used: 96040
Recycled dynamic ram: 1568
Stack ram used: 1304 current, 9152 maximum
Never used ram: 3136
Last reset 00:04:06 ago, cause: power up
Last software reset reason: User, spinning module GCodes, available RAM 3184 bytes (slot 0)
Software reset code 0x0003, HFSR 0x00000000, CFSR 0x00000000, ICSR 0x00400000, BFAR 0xe000ed38, SP 0xffffffff
Error status: 0
Free file entries: 9
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest block write time: 0.0ms
MCU temperature: min 32.4, current 32.8, max 33.1
Supply voltage: min 24.1, current 24.2, max 24.4, under voltage events: 0, over voltage events: 0
Driver 0: stalled standstill
Driver 1: stalled standstill
Driver 2: stalled
Driver 3: standstill
Driver 4: standstill
Date/time: 2017-09-06 00:11:24
Slowest main loop (seconds): 0.005646; fastest: 0.000092
=== Move ===
MaxReps: 1, StepErrors: 0, FreeDm: 239, MinFreeDm 239, MaxWait: 6374ms, Underruns: 0, 0
Scheduled moves: 21, completed moves: 20
Bed compensation in use: none
Bed probe heights: -0.324 -0.891 -1.685 -2.534 -3.688
=== Heat ===
Bed heater = 0, chamber heater = -1
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Segments left: 0
Stack records: 2 allocated, 0 in use
Movement lock held by file
http is idle in state(s) 0
telnet is idle in state(s) 0
file is idle in state(s) 31
serial is idle in state(s) 0
aux is idle in state(s) 0
daemon is idle in state(s) 0
queue is idle in state(s) 0
autopause is idle in state(s) 0
Code queue is empty.
Network state is running
WiFi module is connected to access point
WiFi firmware version 1.19.2
WiFi MAC address 5c:cf:7f:ee:68:10
WiFi Vcc 3.12, reset reason Turned on by main processor
WiFi flash size 4194304, free heap 37176
WiFi IP address 192.168.1.120
WiFi signal strength -80dBm
Reconnections 1
HTTP sessions: 1 of 8
Socket states: 0 2 0 0 0 0 0 0
Responder states: HTTP(1) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)

undefined

I've run the probe single point code from this post: https://www.duet3d.com/forum/thread.php?id=1330

And I can confirm that the bed is moving lower after every probe. I cannot work out what in the code could cause this. Or am I just seriously misunderstanding what is supposed to be happening?

My results were:

-0.25
-1.27
-2.8
-4.58
-5 (Failure, probe not triggered)
Failure, probe not triggered for the rest of the runs and the bed moves progressively lower after every probe attempt.
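To show how much the reading moves between probes, a short Python sketch that takes the four successful results above and prints the change from one probe to the next. It only uses the numbers I posted; nothing here comes from the firmware.

```python
# The four successful readings from the list above, in the order probed.
heights = [-0.25, -1.27, -2.8, -4.58]

# Change in reported height from each probe to the next,
# rounded to two decimals to hide floating-point noise.
drift = [round(later - earlier, 2)
         for earlier, later in zip(heights, heights[1:])]

print(drift)  # [-1.02, -1.53, -1.78]
```

So each probe reads between about 1 and 1.8 lower than the one before, and the step is growing rather than constant.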

undefined

Okay, I'm now running 1.19.2 and AJAX disconnects seem to be less of an issue now. The connection is still nowhere near as stable as 1.18 was though. New rods are in so there's an extra 15cm above the top end of the Z-axis so no chance of accidentally running off of the rods.

On your suggestion, Walter, I printed the calibration tool you made, thank you! With my nozzle touching the bed, it seems to fit near perfectly into the device. I have the bed roughly levelled since fitting the new rods and printed a quick square brim just to check it was roughly flat across the majority of the bed.

The touch still does not like to do mesh compensation probing though. I can manually probe the point where it fails and it's successful. I held a ruler against the structure and it seemed that after probing each point it was going 1mm lower each time. For example, dive height is 5mm. At the first point it would dive 5mm, move to the second and dive 6mm, at the third dive 7mm, etc. And it seems to fail around the 4th point now. Maybe I'm going stir crazy though, as I can't see anything in the code that would cause that.

Here's an Imgur album of my current setup and the brim that was printed: http://imgur.com/a/3eyCp

EDIT:
Here is a YouTube video of the probing failing now. It definitely looks to me like it's noticeably lower by the 5th probe: https://youtu.be/dXOKNOCKlCg
Here is a YouTube video of me successfully probing that same point independently: https://youtu.be/Sby8lnM2F10

Apologies for the shaky cam.

undefined

Appreciate all the help so far. I'll print that off and see what I can do to lower the probe a bit if necessary.

By manually probing the bed after re-levelling I think I've come across the issue. The difference between the highest and lowest points is huge. I think every time I'm levelling the bed one corner is consistently lower, and I'm not noticing as the points I paper test are okay.

Going to make sure the Z carriage is as level as possible and go from there. Perhaps longer rods are the solution. As you pointed out, it's very close to the limit of the rods. It could be popping off slightly on one of the rods and then moving out of alignment every time it homes and comes down again.

Probing at each corner gave me:

FrontLeft 1.022
BackLeft 1.397
BackRight 1.470
FrontRight 0.799
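For a sense of scale, a small Python sketch that finds the highest and lowest of those four corner readings and the difference between them. The dictionary keys just mirror the labels above, and I'm assuming the values are in millimetres.

```python
# Corner readings as posted above (units assumed to be millimetres).
corners = {
    "FrontLeft": 1.022,
    "BackLeft": 1.397,
    "BackRight": 1.470,
    "FrontRight": 0.799,
}

# Corner names with the largest and smallest reading.
highest = max(corners, key=corners.get)
lowest = min(corners, key=corners.get)

# Total difference between those two corners.
spread = corners[highest] - corners[lowest]

print(f"highest: {highest}, lowest: {lowest}, difference: {spread:.3f}")
```

That gives a difference of 0.671 between BackRight and FrontRight, which is several layers' worth at typical layer heights.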

undefined

homez.g

; homez.g
; called to home the Z axis
;
T0 ; select tool
G91 ; relative coordinates
G1 Z5 F200 ; lower bed
G4 P500 ; wait for the bed to lower
G90 ; absolute positioning
G1 X50 Y50 F3000 ; go to first probe point and home the z axis
M401 ;  deploy the probe
G30 ; calibrate Z-axis
M402 ; retract the probe

homeall.g

G91                       ; relative mode
G1 S1 X-305 Y274.5 F3000   ; coarse home X or Y
G1 S1 X-305              ; coarse home X
G1 S1 Y274.5               ; coarse home Y
G1 X4 Y-4 F600             ; move away from the endstops
G1 S1 X-305                ; fine home X
G90
G1 S1 Y50              ; fine home Y

T0 ; select tool
G91 ; relative coordinates
G1 Z5 F200 ; lower bed
G4 P500 ; wait for the bed to lower
G90 ; absolute positioning
G1 X50 Y50 F3000 ; go to first probe point and home the z axis
M401 ;  deploy the probe
G30 ; calibrate Z-axis
M402 ; retract the probe

Quick video here of Z home, then probe. I cut before the probing fails as it continues on like this, then docks and fails. https://youtu.be/xX1OaR1FHOw

I did notice that before it fails, when the bed goes down the Z level reads 6mm, and as it comes up to push the probe it reads -4.57 or so. Then when the probe docks and the bed moves to 20mm, lowering the bed to 0 results in a large gap between the nozzle and bed. Rehoming corrects this. Is this a cause or an effect of the issue?