1HCL and cumulative deviation from accurate position?

tristanryerparke

Hi All,
I'm running a PNP-style machine with an MB6HC (SBC Pi5) and 5x 1HCL. During continuous operation I noticed some drift in the A axis. The drift was large enough that when I command the (rotational) A axis to its 0 position (which looks square to the gantry after a homing cycle), it is visibly off by about 10 to 15 degrees.

I was running some g-code which "unwraps" the A axis by taking the last position modulo 360 (I generate my code with Python) and running a G92 A{remainder} to avoid a lengthy travel back to the normal range. I thought this might be causing the cumulative positional drift, so I disabled it, which seemed to fix the issue. But I was frustrated that what seemed like an obvious solution to the lengthy unwind wasn't working, so I ran some tests.
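For readers following along, the unwrap step described above can be sketched roughly as follows. This is not the script used on the machine: the function names and the four-decimal default are assumptions made for illustration. The second helper only shows how much is lost when the remainder is rounded for the g-code file, which is one candidate (unconfirmed) contributor to accumulation.

```python
def unwrap_command(last_a_deg: float, decimals: int = 4) -> str:
    """Build a G92 that re-labels the current A position as its remainder mod 360."""
    # Python's % takes the sign of the divisor, so the remainder lands in
    # [0, 360) even when the last position is negative.
    remainder = last_a_deg % 360.0
    return f"G92 A{remainder:.{decimals}f}"


def rounding_residue(last_a_deg: float, decimals: int = 4) -> float:
    """Difference between the value written to the file and the exact remainder."""
    remainder = last_a_deg % 360.0
    return round(remainder, decimals) - remainder


# Hypothetical positions, chosen only to show both signs.
print(unwrap_command(725.0))
print(unwrap_command(-370.0))
print(rounding_residue(725.12345678))
```

The residue per unwrap is at most half of the last printed decimal place, so on its own it would be far too small to explain degrees of drift. It is shown only to rule that term in or out.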

How I'm testing for deviation:

G91 
G53 G1 H4 A-360 F3000
G53 G1 A2
G53 G1 H4 A-4 F100
G90 
G4 S0.5 ; wait for motion to complete and allow OM to settle (necessary? or maybe the issue?)
echo {"deviation from home position: "^move.axes[3].machinePosition + 142.5}
; 142.5 is the offset I specify in homea.g

The deviation tests involve running a homing-style motion but not setting any position, and then observing how far the current position is from the supposed home position by polling the OM. To me this seemed a viable method of recording the deviation, and I hope it does to everyone else as well. I've used similar tests in the past to verify the position of open loop systems during operation.

The two test files which simulate an identical set of motions that my machine currently makes during normal operation, one with the G92 unwrap and the other without:

My testing script:

echo "homing the A axis to reset"
G28 A
echo "initial tolerance score"
M98 P"/macros/test_a_issues/print_deviation.g"

;default
M203 A25000 M201 A2000

;slow 1
;M203 A10000 M201 A500

while iterations < 20
    echo "iteration ", iterations + 1
    ;M98 P"/macros/test_a_issues/with_python_mod.g"
    M98 P"/macros/test_a_issues/with_no_wrap.g"
    M98 P"/macros/test_a_issues/print_deviation.g"

This allowed me to observe the deviation over time as the problematic macro(s) were repeatedly run, and the results were quite worrisome. Graphing my console output csv from multiple runs produced this:

I tried the same motion macros with several different configurations, open loop mode, closed loop mode, assisted, slow accel and feedrate etc. Uncommented lines in my testing script and homing files should show how I produced the varying tests, and attached are my testing_x.g and print_deviation_x.g files that I used as a "control".
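A minimal sketch of how the console output could be reduced to a single drift figure per run, assuming each measurement appears on its own line in the form produced by the echo in print_deviation.g. The sample log lines and the function names are hypothetical, not data from the machine.

```python
import re

# Matches the text emitted by the echo command in print_deviation.g.
PATTERN = re.compile(r"deviation from home position:\s*(-?\d+(?:\.\d+)?)")


def parse_deviations(lines):
    """Collect the numeric deviation from every matching console line."""
    values = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


def drift_per_iteration(values):
    """Least-squares slope of deviation against iteration index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Made-up log lines, for illustration only.
log = [
    "initial tolerance score",
    "deviation from home position: 0.0",
    "iteration 1",
    "deviation from home position: 0.1",
    "iteration 2",
    "deviation from home position: 0.2",
    "iteration 3",
    "deviation from home position: 0.3",
]
print(drift_per_iteration(parse_deviations(log)))
```

A slope near zero with scattered values would point at measurement noise, while a steady non-zero slope would point at something accumulating on every iteration.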

A axis runs the following hardware:

1HCL configured as shown in config.g and homea.g
This Closed Loop Stepper
This geared stage 1:10 ratio
PM-R45 NPN Optical endstop on the output face of the stage

X axis uses a 1HCL, Nema23 motor and a 40T GT2 Pulley with 9mm Gates belt on a high quality linear rail.

The accumulation I observed with the G92 wrapping was apparent, but shockingly, its omission showed a serious drift moving in the opposite direction. This led me to think that this was a hardware issue, so I added tests with my simple belt driven (direct drive) X axis as a control. This showed a continual deviation increase of 0.5-1mm across the standard 20 iterations with the same file, regardless of the mode of operation.

So I have the following set of questions to ask on the forum:

Is this a viable way to determine the deviation, and if not why? Do I need to give the object model more time to sync up before I make the comparison? If that is the case why are upward/downward trends visible?
Is the issue related to the high speed at which I am running the A axis motor shaft, and why did a reduction of speed and acceleration not solve the cumulative error issue?
Due to the 10x gear reduction of my A axis, could I be running into floating point errors for the shaft position/micro-step calculation? Maybe some tests with low micro stepping could answer this.
On the X axis control, what other than a slipping belt pulley could cause the visible increase in X error?

Attached is my m122.txt M122 report after a single run of the with_python_mod.g macro. No glaring issues stand out to me from the report.

I'm happy to run additional tests and provide more information to diagnose the issue.

Cheers and hoping to figure this out,
T

o_lampe

@tristanryerparke Your test uses endstops to verify the error.
IMHO they're not perfectly precise to begin with.

Then you use a gear reduction. Is there a backlash compensation somewhere?

With gear reduction and microstepping, the pulse count is very high. You might see missed pulses on the input side of the 1HCL and/or missed encoder pulses at high speed.

Then there is your Python script: do you wrap the position by 360° (integer), or do you use floating point 360.0f °? (Sorry, had to ask.)
What do we know about micropython? Does it use same FP-precision as the 1HCL MCU?

If I was in your shoes, I'd redo the tests. Start with endstop error
Then do each test with full speed, slow speed and again with reduced microsteps.
Don't put them all in one spreadsheet. It's much easier to compare the output when each test is kept separate.

BTW: do you home the A-axis always in the same rotational direction? If not, there's an error caused by the tongue-width. (if it's a light barrier endstop)

dc42

@tristanryerparke thanks for your comprehensive report. I am currently out of the office so unable to do a full analysis at this time.

I have some observations on your method, in particular on the use of G1 H4 and G92 commands. When you execute the G1 H4 command, at the point at which the endstop is triggered, the motor is commanded to stop immediately. In open loop mode this could result in skipped steps if the speed is high and there is enough inertia. In closed loop mode it will result in overshoot but the position should be corrected. I recommend that you execute G1 H4 moves slowly, as is normally the case for homing moves.

There is another cause of overshoot when the motor is being driven by an expansion board as in this case. When the endstop is triggered, a CAN message is sent to tell the motor to stop. Due to CAN latency there will be a small delay before this is received and acted on. Subsequently the main board sends another CAN message telling the expansion board the position, rounded to the nearest microstep, at which it should have been when the endstop triggered. The expansion board then adjusts the motor position back to this point. If this mechanism is working correctly then it should not affect your results.
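To get a feel for the size of the latency term described above, the distance travelled before the stop message takes effect is simply speed times delay. The sketch below uses the two feed rates from the original test macro; the 1 ms latency is a placeholder assumption, not a measured or documented figure.

```python
def latency_overshoot(feed_per_min: float, latency_s: float) -> float:
    """Distance covered at constant speed while the stop message is in flight."""
    # G-code feed rates are in axis units per minute, so convert to per second.
    return feed_per_min / 60.0 * latency_s


ASSUMED_LATENCY_S = 0.001  # hypothetical value, for illustration only

for feed in (3000, 100):  # the F values used in the H4 moves of the test macro
    print(feed, latency_overshoot(feed, ASSUMED_LATENCY_S))
```

Under that assumed delay the fast pass overshoots about thirty times further than the slow pass, which is the practical reason for keeping the final approach slow.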

In closed loop mode, motor position is represented as a 32-bit floating point number. So it has 24 bits of precision. From this you can work out how many rotations of the A axis you can make in the same direction while keeping to within a specified error. For example, if you can tolerate a 1 degree error then you can rotate up to 2^24 degrees without exceeding that error. A further limit is that the motor microstep position is represented as a 32 bit signed integer, so the maximum rotation should not exceed 2^31 microsteps.
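The 24-bit precision argument can be checked numerically. The sketch below measures the gap between adjacent 32-bit floats at a given magnitude using only the standard library. Which unit the firmware actually stores the position in is not stated here, so the magnitudes are treated as generic position units, and the microsteps-per-degree figure in the second helper is a placeholder assumption.

```python
import struct


def float32_spacing(x: float) -> float:
    """Gap between x (rounded to float32) and the next larger float32 value."""
    if x <= 0:
        raise ValueError("this sketch only handles positive values")
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    here = struct.unpack("<f", struct.pack("<I", bits))[0]
    above = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return above - here


def int32_limit_degrees(microsteps_per_degree: float) -> float:
    """Rotation at which a signed 32-bit microstep counter would run out."""
    return 2 ** 31 / microsteps_per_degree


for exponent in (0, 16, 23, 24):
    print(f"2^{exponent}: spacing {float32_spacing(2.0 ** exponent)}")

# 100 microsteps per degree is a hypothetical figure, not taken from config.g.
print(int32_limit_degrees(100))
```

The spacing reaches 1 unit just below 2^24 and doubles above it, which matches the rule of thumb given above. Both limits are many orders of magnitude beyond a 20 or 100 iteration test.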

HTH David

tristanryerparke

Diving deeper into this and heeding your advice @dc42 @o_lampe. I've made several changes to how I'm testing:

Switched the A axis to full-step mode with M350 A1 I1 in config.g as it can only help... (Should I be interpolating or no???)
I'm using Python 3.11 on my RPi to generate the g-code, but I upped my general gcode output precision to four decimal places for the new "problematic macro" and the one that unwinds the axis.
Stopped using G1 H4 to avoid missed steps. Instead I'm creeping up on the endstop with a set of loops like this (see a_to_endstop.g and x_to_endstop.g for the full macros):

G91             ; Relative positioning
; Small steps
while sensors.endstops[3].triggered == false
    G1 A-0.02 F500
    M400
G90             ; Absolute positioning

Interestingly I can hear the A axis clicking out the full steps when moving at this slow speed towards the endstop.
This should give accuracy to the nearest ~0.02 units which is good enough for me. Here are the results of my accuracy tests where I home initially and then run 25 measurement macros interspersed with a back-off command (test_method_accuracy_X.g, test_method_accuracy_A.g) :

Knowing that this new method is reliable, I ran some tests on the A axis with the "problematic macro" (test_A_wrap_closed.g, test_A_nowrap_closed.g) :

I'm still observing the issue I had originally, although the full-stepping seems to have decreased the amount of error. The macro that unwraps still shows some annoying increase of 1-ish degree over 100 iterations (although it does seem to stabilize). @o_lampe I wonder if this hints toward your hypothesis of missed steps/pulses at high speed since there are increasing deviations for both macros? Still the macro with G92 is a big issue after 100 runs.

Here is the same set of tests on the X axis (test_X_wrap_closed.g, test_X_nowrap_closed.g) :

This is a completely acceptable deviation, and suggests to me that the issue has to do with my geared stage and the A motor moving so fast.

I'm in the middle of running the A tests again with limited speed and acceleration (although it is taking a very long time) and will update again as soon as I have the data.

Relevant files:
m122.txt
without_wrap_4f.g
with_wrap_4f.g
homex.g
homea.g
config (1).g

@dc42 I'm not sure if it makes a difference now that I'm using this discrete-creep-up method to test the accuracy, but both my A and X axes have their endstops connected to the same board as the motor/encoder.

Would it be beneficial for me to run these tests in open loop mode as well?

Cheers,
T

o_lampe

@tristanryerparke The mist is clearing!
Thanks for repeating the tests again. Using interpolation should be fine, since it's a driver-internal thing.
It shows that the Python wrap is still buggy and, from around the 90th iteration, seems to reach a new level. Why?

Chasing missed pulses via CAN is beyond my knowledge; maybe have the mainboard and the 1HCL count microsteps and compare them after 100 iterations? @dc42 ?

dc42

@tristanryerparke are you using 3.5 or 3.6 firmware? This could make a difference.

Regarding setting the A axis to full step mode, the effect of this is that the commanded position of the A axis will be rounded to the nearest full step position. So in effect you have reduced the available resolution to 1 full step. As you have a geared A axis, I guess this is acceptable to you. For this application I would have suggested an ungeared axis and our magnetic encoder to provide improved resolution. In general we prefer the magnetic encoder over quadrature encoders built into the motor, not just because of the higher resolution but also because they report absolute position, so they don't require a tuning move after power up. However, if you would also need a hollow shaft for vacuum pickup then I guess this option may not be viable.
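The resolution point can be put into numbers with a short sketch. The 1:10 gear ratio comes from this thread; the 200 full steps per revolution (a 1.8 degree motor) is an assumption, since the motor's step angle is not stated here, and the function name is made up for illustration.

```python
def axis_resolution_deg(full_steps_per_rev: int, microsteps: int, gear_ratio: float) -> float:
    """Smallest commandable move at the output of a geared rotary axis, in degrees."""
    return 360.0 / (full_steps_per_rev * microsteps * gear_ratio)


ASSUMED_FULL_STEPS = 200  # assumption: 1.8 degree motor, not confirmed in the thread
GEAR_RATIO = 10           # the 1:10 stage described earlier in the thread

for microsteps in (1, 16):
    resolution = axis_resolution_deg(ASSUMED_FULL_STEPS, microsteps, GEAR_RATIO)
    print(f"x{microsteps} microstepping: {resolution} degrees per step")
```

If the 200-step assumption holds, full-step mode gives about 0.18 degrees per step at the output, which is coarser than the 0.02 unit increments used in the creep-up loop, so several consecutive increments would round to the same motor position.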

tristanryerparke

Here are the results from the slow tests (speed and acceleration: M203 A8000 M201 A100) :

Very little variance, and what there is looks like a result of the measurement system/full stepping. So I'm guessing it is a problem with the high speed I'm using...? I'll keep bumping up the speed to try and find when the errors begin to occur.

@dc42 I need to run through-hole slip ring wiring, but I'm considering switching to the 5:1 ratio version of this reducer and maybe also a magnetic encoder. I'm only full stepping to see if it would help retain positional stability at these high speeds, and I'd like to have more accuracy if possible.

It seems like these kinds of motor do not give an absolute position reading; the encoder type is listed as "magnetic incremental" and the pinout does not suggest SPI. I could potentially switch to the Duet3D magnetic encoder, but my reason for using these other motors/encoders is that I need a relatively sealed system, so I'd have to design a case with wire exits to go around it. I haven't found any sealed glue-magnet-style encoder motors in the 17 frame size, but maybe they exist?

I'm running 3.5.2 at the moment, would you suggest 3.6 and if so should I install 3.6.0-alpha.5+1 on the mainboard via bossa since I'm using an SBC?

Thanks,
T

o_lampe

@tristanryerparke said in 1HCL and cumulative deviation from accurate position?:

need to run through-hole slip ring wiring

I don't trust those slip rings too much. Even when you find a working speed threshold for now, it will decrease eventually. Or you'll see sporadic failure.

dc42

@tristanryerparke to use 3.6.0-alpha.5+ in SBC mode, currently you have to install version 3.6.0-alpha.2 from the package server, then upload the alpha 5+1 firmware via DWC.

tristanryerparke

Running 3.6.0-alpha.5+1 and the normal test_A_wrap_closed.g, test_A_nowrap_closed.g I got these results:

I'm now just trying to find a middle-ground speed/acceleration combo where the missed steps do not occur, yet which isn't prohibitively slow. @dc42 Any ideas on a semi-sealed, magnet-style closed loop motor? Maybe some existing product I could retrofit the Duet3D encoder to?

Best,
T