Multiple Motion Systems & Macro Conflicts in 3.5
-
Hi everyone,
I'm but a simple FDM printer guy trying to use RRF to run my Voron printer. This has had some hurdles for sure, because Klipper is the go-to there, but I knew RRF from the early days of FDM so I kept going down this path. I recently updated to 3.5.0-rc.3 (on an STM port of Duet).
I have constructed a multi-material unit for my printer and have been busy putting together a library of macros to allow automated filament changes. I happily tested these macros in isolation through DWC and everything was working great. Today was the first test on slicer-generated G-code that called my macros, and all hell broke loose. Despite a number of M400 commands scattered through the code, my macros would fail out on conditionals 10-15s ahead of the movements that should have preceded them.
Because I am using an STM port, I asked on the Discord there and it appears that my issue is the multiple motion system. The documentation suggests that when a macro is loaded BOTH motion systems run it, searching for any M596 commands so that each move command can be queued for the appropriate motion system. However, what is not defined is the behaviour with respect to non-motion commands and meta commands. It seems that both systems do run these, which is why my aborts were happening. In fact, those aborts often happen on both streams close enough together that I get the abort message twice!
I find it a bit hard to believe that this is the intended behaviour of the MMS... Even if the point is to allow the two systems to deconflict, it completely changes the way you have to write a macro in order to ensure that multiple asynchronous runs of the same metacommands don't interfere with each other. A lot of my operations happen on global variables which suddenly become non-deterministic when run in two streams. Especially in my case, where I have no need or desire for a second motion stream, it seems like the default behaviour should be for the second stream to do nothing at all unless otherwise commanded.
Effectively it means the macros I spent days coding are currently unusable, because at the moment I can't see any way to stop the second stream running. I can place M598s everywhere, which fixes conditionals but doesn't stop things like loops going haywire. I tried reducing the queue length of the second command queue to zero but that doesn't appear to work, and it's not clear that would affect metacommands anyway.
I did try the M598 method and it worked for one toolchange, only to end in near-catastrophe on the second change, when something went wrong with the queue and a G0 command was executed at an insane feedrate above all preset limits, causing a massive skip.
I hope I am just missing something here and there is a way to disable the MMS, because I'm at my wits' end. Any thoughts?
Cheers!
-
To provide an example here, my slicer produces normal slicer gcode, and then is set on a toolchange to call
M98 P"mmu/lib/print-filament-change.g" L1 ; the slicer populates the L # based on the next filament
Print Filament Change.g then looks like this at the start:
; This is the macro to call in your slicer when material needs to change
M400
var nextLane = -1
if !exists(param.L)
  M291 S5 R"Manual intervention" P"Please provide lane to change to"
  set var.nextLane = input
else
  set var.nextLane = param.L
if var.nextLane < 0 || var.nextLane > global.mmu_slot_max
  M25 ; PAUSE
  abort "Invalid lane provided"
if var.nextLane != global.mmu_selector_pos
  echo "Parking for filament change"
  M98 P"wiper/lib/wiper-park.g" ;servo move then a big G0 to the parking position
  M98 P"mmu/lib/check-loaded-state.g" ;gets two GPIO pins detecting filament and translates to a the loaded state global
  if global.mmu_loaded_state = 3
    echo "Unloading"
    M98 P"mmu/lib/unload.g"
That unload command is the first place where things go wrong:
;unloads the filament path
M98 P"mmu/lib/check-loaded-state.g"
; first case is something is loaded in toolhead, either properly or slipped past the selector sensor
; either way, we want to clear the toolhead
if global.mmu_loaded_state = 3 || global.mmu_loaded_state = 1
  ;sanity check the temperature and make sure we are at the current lane's temperature
  if global.mmu_lane_temp[global.mmu_selector_pos] <= 180
    abort "Unload.g: Cannot unload due to temperature."
  G10 P{global.mmu_tool_number} S{global.mmu_lane_temp[global.mmu_selector_pos]} ;set tool temp
  M116 P{global.mmu_tool_number} ;wait for tool head temp to be correct
  G1 E{-global.mmu_ram_length} F{global.mmu_extruder_fast_speed} ;retract from hotend and wait to cool a bit
  M400
  G4 P5000
  ;unload to filament sensor by bumping then checking sensors
  var unload = global.mmu_extruder_unload
  while var.unload > 0
    G1 E{-global.mmu_filament_bump} F{global.mmu_extruder_load_speed}
    M400
    M98 P"mmu/lib/check-loaded-state.g"
    if global.mmu_loaded_state = 2
      break
    set var.unload = var.unload - global.mmu_filament_bump
  if global.mmu_loaded_state = 3 || global.mmu_loaded_state = 1
    abort "Unload.g: Could not unload toolhead"
That last abort is where my macro is failing out, before my physical printer has even finished the layer it is printing... If you count it up, that's about 2 servo moves, 3 gantry moves and at least 10 extruder moves that are supposed to happen before it gets to that conditional.
If I replace that first M400 with an M598, then it gets further in the stack... In this way I added M598s before all conditionals to attempt to work around this problem, but the behaviour is still extremely strange, and it doesn't explain the high-feedrate move...
-
@SumoSniper I haven't thoroughly checked your code (I've only gone by the description in your first post) but I think you're bumping into this problem:
https://docs.duet3d.com/en/User_manual/Reference/Gcodes#command-queueing- If a command that is usually queued contains a parameter that is an OM expression enclosed in { } then the command is not queued because the value of the OM expression is liable to change, and there isn't a suitable context to evaluate it in if it were to be queued.
Ian
-
@droftarts Thanks for the suggestion - on reading the link I'm not sure that's my issue.
"Not queued" in this case in my reading would mean that effectively the queuing stops because the firmware needs to wait to be able to evaluate the expressions. So this could cause a delay, as it needs to be processed right before being executed, but it doesn't explain metacommands being executed "early" as such. Unless the lack of queuing is causing a desync between the command and move queues, I would still be mostly suspect of the MMS being my culprit.
-
@SumoSniper said in Multiple Motion Systems & Macro Conflicts in 3.5:
@droftarts Thanks for the suggestion - on reading the link I'm not sure that's my issue.
"Not queued" in this case in my reading would mean that effectively the queuing stops because the firmware needs to wait to be able to evaluate the expressions. So this could cause a delay, as it needs to be processed right before being executed, but it doesn't explain metacommands being executed "early" as such. Unless the lack of queuing is causing a desync between the command and move queues, I would still be mostly suspect of the MMS being my culprit.
I'm not so sure about this - I think this does explain why the meta commands are executed early due to MMS, if you assume that not queued means run immediately. You have File and File2 - File2 runs ahead of File as it isn't processing and waiting for movement commands in a system that doesn't use MMS. But it does process meta gcode commands, and because object model expressions can't be queued, they are executed immediately, which is what causes them to run 'ahead' of where you would expect.
How to work around this I'm not sure of yet, but it does seem like it creates a very big foot gun since it fundamentally changes how meta gcode works, with (currently?) no way to revert to the previous behaviour.
It seems to me like not having a way to disable MMS behaviour on machines that don't have multiple movement systems is going to cause some problems, given that now there's a concurrency problem whenever one writes meta gcode that will be executed from a print file.
-
@NineMile Correct, if not queued means run immediately then we would expect this behaviour but that would be... well weird to say the least. Some clarification from devs would be ideal.
I have created a simple test case that anyone should be able to run to reproduce this:
G21 ; set units to millimeters
G90 ; use absolute coordinates
G0 Z5
G0 X50 Y50
M400
G1 X150 Y50 F3000 ; this move should take 2 seconds
G1 X100 Y50 F3000 ; this move should take 1 second
M400
M98 P"abort-test-1.g" ;abort test macro checks to see if X=100, if not, aborts
G1 X50 Y50 F3000 ; this move should take 1 second ** IN MY TESTING, THE NEXT MACRO RUNS DURING THIS MOVE AND FAILS OUT
G4 P2000
G1 X150 Y50 F3000 ; this move should take 2 seconds
G1 X100 Y50 F3000 ; this move should take 1 second
M400
M98 P"abort-test-2.g" ;abort test macro checks to see if X=100, if not, aborts
G1 X50 Y50 F3000 ; this move should take 1 second
G4 P2000
G1 X150 Y50 F3000 ; this move should take 2 seconds
G1 X{global.abortpos} Y50 F3000 ; this move should take 1 second (global.abortpos is created by abort-test-2.g)
M400
M98 P"abort-test-3.g" ;abort test macro checks to see if X=100, if not, aborts
G4 P2000
echo "Did not abort" ; if we got here, then the M400 did its job and stopped the M98 being executed before the move finished
abort-test-1.g
if move.axes[0].machinePosition != 100
  abort "Aborted at macro 1! MP: " ^ move.axes[0].machinePosition
if !exists(global.aborttest1)
  global aborttest1 = true
abort-test-2.g
if move.axes[0].machinePosition != 100
  abort "Aborted at macro 2! MP: " ^ move.axes[0].machinePosition
if !exists(global.abortpos)
  global abortpos = 100
abort-test-3.g
if move.axes[0].machinePosition != 100
  abort "Aborted at macro 3! MP: " ^ move.axes[0].machinePosition
Put the 3 .g files in your sys directory and then run the .gcode as a job.
In all cases, the machine does some slow moves between positions, ending at X100 with an M400 command afterwards. According to the MMS documentation, this should allow all moves to finish before further execution. The abort-test macros then check the machine position and abort if it is not X100.
As commented in the code, my testing fails at test #2, over 8 seconds before the macro should have been executed. Interestingly, I was expecting it to fail already at abort no. 1. I included test 3 to try @droftarts' theory but I never get there anyway...
-
@droftarts Is there any chance we can move this to the Beta Firmware forum? I think I misplaced it by putting it here.
-
It looks like this thread might be a similar issue if that helps to combine efforts.
[3.5.0-rc.3] Conditional 'abort' command called unexpectedly
-
@dc42 @gloomyandy is there any chance we can get an M code to just switch the 2nd motion system processing off until this one gets figured out?
I posted that simple test code above that reliably brings up the problem.
-
Hi @SumoSniper ,
I'm facing similar issues, but with a more complex macro structure in which submacros modify and check global variables. I haven't investigated too deeply but have been able to get around all issues so far by adding some M598s, at first randomly at the start of every macro being called and at various points inside the macros. I settled on placing M598s after calls to macros that could modify a global variable that I will need later on, and before I check or use a global variable that could have been modified by a macro. Due to time constraints I could not investigate further or boil it down to test cases, but I figured it's because of the deferred queue. Now macros called from jobs behave just like the ones called when no job is running.
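A minimal sketch of that placement, reusing macro and global names from earlier in the thread. Exactly where the M598 calls are needed is an assumption drawn from the description above, not a verified rule:

```gcode
; Sketch only: M598 placement around a shared global
M98 P"mmu/lib/check-loaded-state.g"   ; sub-macro that may update global.mmu_loaded_state
M598                                  ; sync the motion systems after the call that may modify the global
if global.mmu_loaded_state = 3        ; the global is only read after the sync
  echo "Unloading"
  M98 P"mmu/lib/unload.g"             ; may modify globals again
  M598                                ; sync again before anything else reads them
```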
I hope to have some time to read up more on this topic at some point. It would also help if the documentation had some examples of what could go wrong for people using only 1 motion system but trip over some problems with macros, meta commands and queueing etc.
Hope my "advice" could help out
Cheers
-
@SumoSniper @adambx I am currently reviewing how conditional GCode commands and multiple motion systems interact. Currently, global, set global and echo commands are only processed by the active file reader; but other commands, including abort, and the evaluation of conditions in if, elif and while commands are performed by both motion systems. I will change it to execute abort commands only from the active reader; however this will not solve the issue of using global variables in the evaluation of conditions.
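To make that split concrete, here is an annotated sketch of a small macro. The global name demo_count is hypothetical, and the comments only restate the behaviour described above for 3.5.0-rc.3; they have not been checked against the firmware source:

```gcode
; Sketch only: which file reader handles each line, per the description above
if !exists(global.demo_count)                   ; condition evaluated by both file readers
  global demo_count = 0                         ; 'global' is processed by the active reader only
set global.demo_count = global.demo_count + 1   ; 'set global' is processed by the active reader only
echo "count is", global.demo_count              ; 'echo' is processed by the active reader only
if global.demo_count > 1                        ; evaluated by both readers, possibly at different times
  abort "demo abort"                            ; executed by both readers until abort is restricted to the active reader
```

Because the two readers can reach the second condition at different moments, each may see a different value of the global, which matches the double abort messages reported earlier in the thread.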
Meanwhile, as a workaround it should be sufficient to place the following at the start of each affected macro, to make the inactive file reader skip all of it:
if state.thisInput != null & !state.thisInput.active
  M99
-
PS - the firmware binaries at https://www.dropbox.com/scl/fo/p0136wx04h8xf6ejwdnn9/h?rlkey=efrfwyb6o5tqid11gustz3uvy&dl=0 now include the change to execute abort only from the active file stream.
-
@dc42 Excellent thanks for that, the workaround should do it for me, will test it out soon.
No doubt picking what gets executed where is a bit of a pickle, similar to the challenges of refactoring code for multithreading.
I wonder if you might not save yourself a bunch of headaches by putting in an optional parameter on M98 that selects the motion system / reader you want to run it on? Then you could have either a default behaviour or a reserved constant you could drop into that so that you can perpetuate macro execution within the same motion system.
e.g. a normal macro run is M98 P"foo.g" with an optional "S" parameter, where -1 means run on the calling thread, 0 means run on ALL, and 1+ means run on that motion system. The default could then be -1, which would give the most consistent results for 99% of users.
-
Thanks a lot. Will try it out!
-
@dc42 I'm getting an error with this code:
if state.thisInput != null & !state.thisInput.active
  M99
Error: line 8 column 31: meta command: reached primitive type before end of selector string
The position is pointing to the ! on the second selector which is a bit confusing...
edit: I tried to diagnose more:
28/02/2024, 12:03:16 pm  echo state.thisInput != "null"
    Error: at column 31: meta command: cannot convert operands to same type
28/02/2024, 12:01:37 pm  echo state.thisInput
    0
28/02/2024, 12:01:22 pm  echo state.thisInput != null
    true
28/02/2024, 12:00:59 pm  echo exists(state.thisInput)
    true
But in the object model on DWC it is showing as =null
Further edit: I put echos at the start of the relevant macro to see what was going on and the same outputs are happening - so state.thisInput exists and is returning 0 but not evaluating as null, so it can't find .active and fails out.
-
@SumoSniper please try this instead:
if !inputs[state.thisInput].active
  M99
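Placed in context, the guard would be the first executable line of each macro called from a print job, so that the inactive file reader returns before it reaches any conditionals. A sketch only, reusing lines from the filament-change macro posted earlier in the thread:

```gcode
; Sketch only: guard at the top of a macro that is called from a print job
if !inputs[state.thisInput].active
  M99                                 ; return from the macro when run by the inactive file reader

; the rest of the macro follows unchanged, for example:
M98 P"mmu/lib/check-loaded-state.g"
if global.mmu_loaded_state = 3
  echo "Unloading"
  M98 P"mmu/lib/unload.g"
```

Sub-macros called from a guarded macro are never reached by the inactive reader through that call, but any macro the slicer calls directly would need its own guard.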
state.thisInput should return 2 when executed by the primary File stream and 12 when executed by the secondary File stream. It should return 0 when executed from the http stream.
-
@dc42 What would it return when I execute it from daemon.g or a macro? Are those http-calls, too?
-
@o_lampe it would return true. The only reason it shows as null in the object model browser is that the fetch of the object model by DWC is not made in the context of any input. If you run:
echo inputs[state.thisInput].active
from the DWC command line then it will return true.
-
@dc42 Thanks! That one has fixed it, macros functioning as expected now. Interested to hear where the MMS metacommand handling goes when you've decided on that!
Cheers!
-
@SumoSniper @adambx @NineMile @o_lampe I have put new firmware binaries at https://www.dropbox.com/scl/fo/3y4agkmfzfmqcifuecqo3/h?rlkey=j0kibs1tubm5dfj7o2vz1vbzj&dl=0. The main difference is that multiple file readers are now only used if your job file or start.g file uses the new M606 command to fork the file reader (see https://docs.duet3d.com/en/User_manual/Reference/Gcodes#m606-fork-input-file-reader) . Therefore, unless you use that command, macros will only be executed by one reader, which should solve the problem.
If you do use M606 then the behaviour will be as it was previously from that point onwards, and you will need to ensure that any macros executed subsequently to that M606 command are suitable for execution by both file readers. For example, you can ensure that any global variables needed are set up before the M606 command executes. Please note, testing of the M606 command is still in progress.
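As a rough sketch of that ordering at the top of a job file: the global name is borrowed from earlier in the thread, its initial value is an assumption, and the S1 parameter reflects my reading of the linked M606 documentation, so please verify it there before use:

```gcode
; Sketch only: create shared globals while a single file reader is running
if !exists(global.mmu_loaded_state)
  global mmu_loaded_state = 0         ; initial value is a placeholder

M606 S1                               ; fork the input file reader (parameter to be confirmed in the M606 docs)
; from this point on, macros are executed by both file readers,
; so they must be safe to run twice or skip the inactive reader themselves
```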