"Sawtooth" Error in Motion, Even Open Loop?
-
I've been spending time tuning with the Closed Loop plugin lately on 3.5.0beta1, and I keep seeing an oscillating "sawblade" pattern in the error during movement:
The movement above is a square at a 45° angle so that only one stepper moves at a time, since it's a CoreXY printer. It's just one stepper performing a straight movement, a pause, and a straight movement back. GCode here.
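To see why a 45° square isolates one stepper, here is a minimal sketch assuming the common CoreXY convention in which motor A moves by ΔX + ΔY and motor B by ΔX − ΔY. The function name and the sign convention are illustrative assumptions; the actual signs depend on wiring and firmware configuration.

```python
def corexy_motor_deltas(dx, dy):
    """Map a Cartesian move (dx, dy) to CoreXY belt-motor displacements.

    Assumed convention: A = dx + dy, B = dx - dy (signs vary by machine).
    """
    return dx + dy, dx - dy


# The four sides of a square rotated by 45 degrees (side vectors in mm).
sides = [(10, 10), (-10, 10), (-10, -10), (10, -10)]
for dx, dy in sides:
    a, b = corexy_motor_deltas(dx, dy)
    print(f"move ({dx:+d}, {dy:+d}) -> motor A {a:+d}, motor B {b:+d}")
```

On every side of the rotated square one of the two motor displacements is zero, so each plotted segment reflects a single stepper.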
I can kinda-sorta tune it out with sufficiently aggressive PID parameters at the cost of overshoot, but it still bugged me. So I looked at the motion in open loop mode and, lo and behold, the same sawblade error appears:
(P.S. - I love that I can use the 1HCL to view behavior of open loop steppers.)
My first instinct was that I revealed some kind of stiction or other motion system issue, but then I took a look with the belts off. Same issue with both closed and open loop:
The previous plots were at 400mm/sec. The same pattern shows up at lower speeds as well, though. Here's 50mm/sec open loop:
Machine is using Duet 3 6HC in SBC mode + 2 x 1HCL running 3.5 beta1.
The above were with Trinamic TMCS-28-6.35-10k-AT-01 10,000 CPR / 40,000 PPR encoders on 1.8° ML23HS0L4350-E steppers at 48V. (Yes, overkill for this application.) I've seen similar with CUI AMT10E2 encoders at lower resolutions, and even tried 0.9° ML23HA0L4350-E steppers to see what I could see. Same behavior.
Why would there be a periodic error to movement like this, even in open-loop mode, and even with belts off the steppers? Am I doing something dumb? (I always assume I'm doing something dumb.)
For completeness, machine config is here, and here are the M122 diagnostics:
M122 B51
Diagnostics for board 51:
Duet EXP1HCL firmware version 3.5beta1 (2022-12-23 18:43:49)
Bootloader ID: SAME5x bootloader version 2.4 (2021-12-10)
All averaging filters OK
Never used RAM 52392, free system stack 161 words
Tasks: Move(notifyWait,0.0%,108) HEAT(notifyWait,0.0%,88) CanAsync(notifyWait,0.0%,70) CanRecv(notifyWait,0.0%,79) CanClock(notifyWait,0.0%,70) TMC(notifyWait,37.0%,351) CLSend(notifyWait,0.0%,152) MAIN(running,60.9%,395) IDLE(ready,0.0%,30) AIN(notifyWait,2.0%,265), total 100.0%
Last reset 00:09:11 ago, cause: software
Last software reset data not available
Closed loop enabled: yes, pre-error threshold: 1.00, error threshold: 2.00, encoder type rotaryQuadrature, position -76033
Encoder reverse polarity: yes, raw count 10497
Tuning mode: 0, tuning error: 0, collecting data: no
Control loop runtime (ms): min=0.008, max=0.041, frequency (Hz): min=8242, max=16304
Driver 0: pos -158389, 320.0 steps/mm,ok, SG min 27, mspos 230, reads 6922, writes 64039 timeouts 0, steps req 384000 done 384000
Moves scheduled 23, completed 23, in progress 0, hiccups 0, step errors 0, maxPrep 65, maxOverdue 0, maxInc 0, mcErrs 0, gcmErrs 0
Peak sync jitter 1/17, peak Rx sync delay 201, resyncs 0/0, no step interrupt scheduled
VIN voltage: min 48.0, current 48.0, max 48.1
V12 voltage: min 12.1, current 12.1, max 12.1
MCU temperature: min 37.7C, current 37.9C, max 38.1C
Last sensors broadcast 0x00000000 found 0 118 ticks ago, 0 ordering errs, loop time 0
CAN messages queued 2526, send timeouts 0, received 2821, lost 0, free buffers 37, min 37, error reg 0
dup 0, oos 0/0/0/0, bm 0, wbm 0, rxMotionDelay 307, adv 35645/37075
12/31/2022, 11:19:08 AM M122 B50
Diagnostics for board 50:
Duet EXP1HCL firmware version 3.5beta1 (2022-12-23 18:43:49)
Bootloader ID: SAME5x bootloader version 2.4 (2021-12-10)
All averaging filters OK
Never used RAM 52392, free system stack 167 words
Tasks: Move(notifyWait,0.0%,108) HEAT(notifyWait,0.0%,88) CanAsync(notifyWait,0.0%,70) CanRecv(notifyWait,0.0%,79) CanClock(notifyWait,0.0%,70) TMC(notifyWait,37.0%,351) CLSend(notifyWait,0.2%,122) MAIN(running,60.8%,395) IDLE(ready,0.0%,30) AIN(notifyWait,2.0%,265), total 100.0%
Last reset 00:09:09 ago, cause: software
Last software reset data not available
Closed loop enabled: yes, pre-error threshold: 1.00, error threshold: 2.00, encoder type rotaryQuadrature, position -94592
Encoder reverse polarity: yes, raw count 29056
Tuning mode: 0, tuning error: 0, collecting data: no
Control loop runtime (ms): min=0.008, max=0.055, frequency (Hz): min=7075, max=16667
Driver 0: pos 36801, 320.0 steps/mm,ok, SG min 28, mspos 290, reads 65125, writes 57829 timeouts 0, steps req 384000 done 384000
Moves scheduled 22, completed 22, in progress 0, hiccups 0, step errors 0, maxPrep 66, maxOverdue 0, maxInc 0, mcErrs 0, gcmErrs 0
Peak sync jitter 0/11, peak Rx sync delay 207, resyncs 0/0, no step interrupt scheduled
VIN voltage: min 48.0, current 48.1, max 48.1
V12 voltage: min 12.1, current 12.2, max 12.2
MCU temperature: min 33.1C, current 33.1C, max 33.9C
Last sensors broadcast 0x00000000 found 0 161 ticks ago, 0 ordering errs, loop time 0
CAN messages queued 7740, send timeouts 0, received 2897, lost 0, free buffers 37, min 37, error reg 0
dup 0, oos 0/0/0/0, bm 0, wbm 0, rxMotionDelay 310, adv 35769/36259
12/31/2022, 11:19:04 AM M122
=== Diagnostics ===
RepRapFirmware for Duet 3 MB6HC version 3.5beta1 (2022-12-23 18:27:08) running on Duet 3 MB6HC v1.01 (SBC mode)
Board ID: 08DJM-956L2-G43S8-6J1DJ-3SJ6N-980LG
Used output buffers: 1 of 40 (17 max)
=== RTOS ===
Static ram: 151524
Dynamic ram: 74784 of which 0 recycled
Never used RAM 121364, free system stack 154 words
Tasks: SBC(ready,0.7%,458) HEAT(notifyWait,0.0%,321) Move(notifyWait,0.0%,255) CanReceiv(notifyWait,0.2%,763) CanSender(notifyWait,0.0%,335) CanClock(delaying,0.0%,340) TMC(notifyWait,7.8%,56) MAIN(running,91.0%,953) IDLE(ready,0.3%,30), total 100.0%
Owned mutexes: HTTP(MAIN)
=== Platform ===
Last reset 00:09:06 ago, cause: software
Last software reset at 2022-12-31 17:09, reason: User, Platform spinning, available RAM 121160, slot 0
Software reset code 0x6000 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0x00000000 SP 0x00000000 Task SBC Freestk 0 n/a
Error status: 0x00
Step timer max interval 786
MCU temperature: min 52.3, current 52.7, max 52.8
Supply voltage: min 23.8, current 23.9, max 23.9, under voltage events: 0, over voltage events: 0, power good: yes
12V rail voltage: min 12.1, current 12.1, max 12.2, under voltage events: 0
Heap OK, handles allocated/used 99/17, heap memory allocated/used/recyclable 2048/744/496, gc cycles 0
Events: 0 queued, 0 completed
Driver 0: standstill, SG min n/a, mspos 4, reads 23289, writes 20 timeouts 0
Driver 1: standstill, SG min n/a, mspos 8, reads 23295, writes 14 timeouts 0
Driver 2: standstill, SG min n/a, mspos 8, reads 23295, writes 14 timeouts 0
Driver 3: standstill, SG min 43, mspos 8, reads 23284, writes 25 timeouts 0
Driver 4: standstill, SG min 84, mspos 8, reads 23284, writes 25 timeouts 0
Driver 5: standstill, SG min 85, mspos 8, reads 23284, writes 25 timeouts 0
Date/time: 2022-12-31 17:19:04
Slowest loop: 36.57ms; fastest: 0.04ms
=== Storage ===
Free file entries: 10
SD card 0 not detected, interface speed: 37.5MBytes/sec
SD card longest read time 0.0ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 125, segments created 6, maxWait 191256ms, bed compensation in use: none, comp offset 0.000
no step interrupt scheduled
=== DDARing 0 ===
Scheduled moves 31, completed 31, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== DDARing 1 ===
Scheduled moves 0, completed 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Movement locks held by null, null
HTTP* is doing "M122" in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger* is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
SBC is idle in state(s) 0
Daemon is idle in state(s) 0
Aux2 is idle in state(s) 0
Autopause is idle in state(s) 0
File2 is idle in state(s) 0
Queue2 is idle in state(s) 0
Q0 segments left 0, axes/extruders owned 0x1000003
Code queue 0 is empty
Q1 segments left 0, axes/extruders owned 0x0000000
Code queue 1 is empty
=== CAN ===
Messages queued 5059, received 17476, lost 0, boc 0
Longest wait 13ms for reply type 6018, peak Tx sync delay 445, free buffers 50 (min 46), ts 2731/2730/0
Tx timeouts 0,0,0,0,0,0
=== SBC interface ===
Transfer state: 5, failed transfers: 0, checksum errors: 0
RX/TX seq numbers: 22960/22960
SPI underruns 0, overruns 0
State: 5, disconnects: 0, timeouts: 0 total, 0 by SBC, IAP RAM available 0x29b04
Buffer RX/TX: 0/0-0, open files: 0
=== Duet Control Server ===
Duet Control Server version 3.5.0-b1 (2022-12-23 20:41:36)
Failed to deserialize the following properties:
- ModelCollection`1 -> Int32 from 0.030
Code buffer space: 4096
Configured SPI speed: 8000000Hz, TfrRdy pin glitches: 0
Full transfers per second: 42.28, max time between full transfers: 84.0ms, max pin wait times: 68.7ms/16.0ms
Codes per second: 0.36
Maximum length of RX/TX data transfers: 7436/852
-
@evan38109
I'm only wild guessing here, but the 'unused' stepper isn't totally passive during the moves.
The FW tries to hold it in its (microstep) position, which means both coils are pulling in both directions.
Maybe your encoders are so super sensitive that they see this "fight"?
-
What's the meaning of the error value? Is it the offset from the expected position in degrees, percentage, microsteps, or full steps?
-
The value "Current error" reported by the EXP1HCL is measured in full steps.
I suspect that what we are seeing here is that for most motors, the full steps are only guaranteed to be of the same size to within 5%, in other words there may be an error of +/- 0.05 full steps. Microsteps are even less uniform.
Also, the motor may not actually move at all on some microsteps because of friction, especially at low speeds. That may be the reason for the very fine oscillation superimposed on the sawtooth pattern.
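To put a full-step error into more familiar units, here is a minimal sketch of the conversion. The function names are hypothetical; the defaults (1.8° step angle, 40,000 quadrature counts per revolution) are taken from the hardware described in the first post and should be changed for other motors or encoders.

```python
def full_steps_to_degrees(err_steps, step_angle_deg=1.8):
    """Convert an error in full steps to shaft degrees."""
    return err_steps * step_angle_deg


def full_steps_to_encoder_counts(err_steps, step_angle_deg=1.8,
                                 counts_per_rev=40_000):
    """Convert an error in full steps to quadrature encoder counts."""
    full_steps_per_rev = 360.0 / step_angle_deg
    return err_steps * counts_per_rev / full_steps_per_rev


# The +/- 0.05 full-step tolerance mentioned above, in other units.
print(full_steps_to_degrees(0.05))
print(full_steps_to_encoder_counts(0.05))
```

Under these assumptions, 0.05 full steps works out to about 0.09° of shaft rotation, or roughly 10 encoder counts, so an error of that size is well within what the encoder can resolve.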
-
@o_lampe With the belts removed from the steppers? Hmm...Nah, I think there's something else going on here...
-
@dc42 That makes sense. Candidly, one of the things I've been looking forward to seeing with the 1HCL was the actual accuracy of microstepping and open loop movement. And I think it's cool to see the lag during motion, which you'd expect from stepper mechanics.
But the error isn't +/- 0.05 steps, it's much bigger -- and it's not symmetrical.
Here's three rotations forwards and backwards, with a 100ms pause in between. Open loop, no belts attached. A couple things to note:
- The period of the oscillation seems to be half a rotation.
- There's a major asymmetry, which you see above as well.
The error is oscillating around 0.2 steps in one direction and up to a whopping 0.8 steps in the other. (Most times it's closer to 0.5 or 0.6, but sometimes...well, there's the screenshot...) You can see the same asymmetry in the plots from my first post as well.
It's making it just impossible to find decent PID settings.
Why would there be such an asymmetry in error depending on which direction a stepper is moving?
Edit: Oh, and if it makes a difference, here is the backlash measurement from encoder tuning:
-
@evan38109 thanks. The amplitude of the sawtooth is of the same order as the measured backlash.
Perhaps the motor has more friction moving in one direction than the other, although I can't think why that should be.
Increasing motor current usually reduces the measured backlash, at least up to a certain point. Can you test whether it reduces the sawtooth amplitude?
Do both motors show similar errors?
-
@dc42 Well, I've got at least half an answer to the mystery. I figured out and resolved why the motion was asymmetrical, though the underlying sawtooth shape and 0.15-ish step error remain.
First, regarding current: all of these are at 2.8A. These steppers are rated at 3.5A, so I can't push them too much more, but...I'll see what I can do.
I've swapped the encoders before, but I've been testing more...vigorously...recently. When I removed the encoders, I noticed a fine dust on the optical wheel. The stepper's rear shaft was just long enough to lightly rub against a screw on the underside of the encoder housing. I added a shim beneath the encoder and voila. Here's two rotations forward and two back in open loop mode at 200mm/sec, thankfully symmetrical:
Backlash is now measured consistently closer to 0.06 steps.
It's still fascinating to me that straight, constant-speed movement is...not. If anyone has any idea why that is, I'm all ears. I'd love to understand more.
That said, I'm now making headway on closed loop tuning. As a bit more realistic test, I took the outer perimeter from layer 42 of a Benchy, sliced at 150mm/sec. With some preliminary tuning, closed loop error looks like this:
The squiggles in the first half are the rear nameplate of the Benchy where it reads "#3DBenchy," while the right half is the smoother bow.
For comparison, open loop looks like this:
Closed loop is staying within 0.2 steps in the worst case, with the mean error centered right at zero. Open loop strays to around 0.45 steps of error, with a significant offset from zero.
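For anyone who wants to put numbers on a comparison like this, here is a rough sketch of the three figures involved: mean offset, worst-case absolute error, and peak-to-peak swing. The function name and the sample values are made up for illustration; they are not data from the plots, and the recorded error samples are assumed to be available as a plain list of values in full steps.

```python
from statistics import fmean


def summarize_error(samples):
    """Summarize error samples (in full steps).

    offset       -- mean error, i.e. how far the trace sits from zero
    worst_abs    -- largest absolute error seen
    peak_to_peak -- swing of the oscillation, independent of the offset
    """
    return {
        "offset": fmean(samples),
        "worst_abs": max(abs(s) for s in samples),
        "peak_to_peak": max(samples) - min(samples),
    }


# Made-up values for illustration only.
demo = [0.10, 0.30, 0.45, 0.30, 0.10]
print(summarize_error(demo))
```

Separating the offset from the peak-to-peak value is useful because a change can shift the whole trace toward zero without shrinking the oscillation itself.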
Pretty fascinating stuff.
-
@dc42 Here is the same motion at various currents. Both steppers move the same now.
TL;DR while increasing current reduces the absolute error, it doesn't look like it makes much difference to the amplitude of the sawtooth.
I've also tried all manner of other settings on the Trinamic drivers -- TBL, TOFF, hysteresis, CoolStep, etc. Ditto for accel and jerk. No difference.
Anything else you'd like me to try?
These are all two rotations forward then back at 200mm/sec, open loop, belts removed.
1.0A
2.0A
3.0A
3.5A
-
@evan38109 looks like current doesn't affect the oscillation in the errors, at the currents you were using and higher.
Is the period definitely 2 cycles per revolution, not 1? If it was 1 then that could be explained by the optical disc not being exactly concentric with the shaft.
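One way to check this numerically rather than by eye is to project the recorded error onto the first and second harmonics of the shaft angle. This is only a sketch under stated assumptions: the function name is hypothetical, the samples are assumed to be evenly spaced over a whole number of revolutions, and the data below is synthetic, not taken from the plots.

```python
import math


def harmonic_amplitude(angles_rad, errors, k):
    """Amplitude of the k-cycles-per-revolution component of the error.

    Assumes samples are evenly spaced over a whole number of revolutions,
    so that sine and cosine terms of different harmonics are orthogonal.
    """
    n = len(errors)
    mean = sum(errors) / n
    c = 2.0 / n * sum((e - mean) * math.cos(k * a)
                      for a, e in zip(angles_rad, errors))
    s = 2.0 / n * sum((e - mean) * math.sin(k * a)
                      for a, e in zip(angles_rad, errors))
    return math.hypot(c, s)


# Synthetic example: a pure 2-per-rev error of 0.15 steps plus a small offset.
n = 400
angles = [2.0 * math.pi * i / n for i in range(n)]
errors = [0.02 + 0.15 * math.sin(2.0 * a) for a in angles]

print("1 cycle/rev amplitude:", harmonic_amplitude(angles, errors, 1))
print("2 cycles/rev amplitude:", harmonic_amplitude(angles, errors, 2))
```

A dominant k=1 component would point toward an eccentric disc, while a dominant k=2 component would suggest something else is going on.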
-
@dc42 Yep, the period is definitely two cycles per revolution. It's also consistent on all motors and encoders I have, including 0.9 degree steppers and CUI AMT10E2 capacitive encoders.
Here's just one revolution out and back. I marked the shaft and watched as it went. At this scale (and with the other issue resolved), it's more of a cosine wave than a sawtooth pattern. Between that and the two-cycles-per-revolution period, it makes me think it has something to do with pi, but I couldn't guess what.