[Feature Request] GCode to create a copy of a file
-
Hi
Can you please add a G-code command to create a copy of a file? It could also be `M471` with a flag that preserves the source file.
Use case: When a print job is running, I periodically flush vars (via `echo >>`) to a file to create a restoration point in case of an OOM reboot, which happens quite often in my case. The problem is that the Duet sometimes reboots while the flush is still in progress, and the restoration file gets corrupted. To handle this scenario I would like to keep a copy of the last successful restoration point.
ATM, what I'm doing is the following:
- call `flush.g` that will create `restoration.g` @ `/sys` dir
- move `restoration.g` into a `/backup` dir, via `M471` that "deletes" the file from the `/sys`
- call `flush.g` again to create new `restoration.g`
- call
-
@AndMaz Why not just leave the file where it is and simply change the name? so....
- Create the new file as restoration.new
- delete any file called restoration.old
- rename existing restoration.g as restoration.old
- rename restoration.new as restoration.g
With the above you should always have a valid saved state, and you can recover it even if things crash during the above process:
- If restoration.g exists it will be valid so use it.
- If no restoration.g exists rename restoration.old as restoration.g and use it.
- if no restoration.g exists and no restoration.old exists (very unlikely) check for a valid restoration.new and use it.
A rename operation is likely to be far more "atomic" than write/copy operations.
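The rotate-and-recover steps above can be modelled off the printer to check the crash cases. The following is a minimal Python sketch that runs against a local directory; it is not RRF macro code, and on the Duet the renames would go through M471 as mentioned earlier in the thread. The function names `rotate` and `recover` are made up for this illustration, and the `fsync` call is an extra precaution that is not part of the steps listed above.

```python
import os
import tempfile


def _paths(directory):
    """Return (current, old, new) paths for the three restoration files."""
    return (
        os.path.join(directory, "restoration.g"),
        os.path.join(directory, "restoration.old"),
        os.path.join(directory, "restoration.new"),
    )


def rotate(directory, new_contents):
    """Write a new restoration point and rotate it into place using renames."""
    cur, old, new = _paths(directory)

    # 1. Create the new file as restoration.new
    with open(new, "w") as f:
        f.write(new_contents)
        f.flush()
        os.fsync(f.fileno())  # assumption: push the data to disk before renaming

    # 2. Delete any file called restoration.old
    if os.path.exists(old):
        os.remove(old)

    # 3. Rename existing restoration.g as restoration.old
    if os.path.exists(cur):
        os.rename(cur, old)

    # 4. Rename restoration.new as restoration.g
    os.rename(new, cur)


def recover(directory):
    """Return the path of the file to restore from, or None if there is none."""
    cur, old, new = _paths(directory)

    if os.path.exists(cur):
        return cur  # restoration.g exists, so it is the valid one
    if os.path.exists(old):
        os.rename(old, cur)  # crashed between steps 3 and 4
        return cur
    if os.path.exists(new):
        return new  # very unlikely; the caller must still check it is valid
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        rotate(d, "; state 1\n")
        rotate(d, "; state 2\n")
        with open(recover(d)) as f:
            print(f.read())
```

If the process stops between any two steps, `recover` still finds a complete file: the only file that can be half-written is restoration.new, and it is used only when neither of the other two exists.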
Also have you worked out what is causing your OOM problem? It might be worth reporting it, it would be far better to avoid whatever is causing the issue rather than having to work around it. In particular if RRF is crashing while you are writing your restoration.g file then you have the additional risk of corrupting the underlying file system, which is almost certainly not good. Additionally all of this writing to the SD card will be creating wear on the card, best to not need to do it if you can.
-
@gloomyandy apologies for the delay, I missed your reply.
Thank you, I'll try your approach.
> Also have you worked out what is causing your OOM problem? It might be worth reporting it, it would be far better to avoid whatever is causing the issue rather than having to work around it. In particular if RRF is crashing while you are writing your restoration.g file then you have the additional risk of corrupting the underlying file system, which is almost certainly not good. Additionally all of this writing to the SD card will be creating wear on the card, best to not need to do it if you can.
I agree with you, this solution is far from perfect. Re the OOM problem, I'm not entirely sure what's causing it. It happens randomly, so I cannot pinpoint the exact reason. While printing I'm constantly monitoring and collecting telemetry from the Duet (my arch can be seen in an old ticket that I opened: https://forum.duet3d.com/topic/31420/rr_reply-multiple-clients-same-machine).
In addition to printing and being monitored, the user might "ask" Duet to perform some additional tasks via HTTP API. I guess that this might be too much for Duet to handle and the OOM happens.
But it would be great to know exactly what scenario causes the OOM. Do you know a way of debugging this kind of issue?
-
@AndMaz said in [Feature Request] GCode to create a copy of a file:
> But it would be great to know exactly what scenario causes the OOM. Do you know a way of debugging this kind of issue?
The key thing is to collect data when this happens: capture the output from M122 as soon as possible after the OOM, along with a description of what was happening at the time.
As for your monitoring, although it may be a good thing, remember that making frequent calls via the web interface adds load on RRF (and could in some circumstances be contributing to the OOM problem).
-
@gloomyandy I'm collecting telemetry every 10 secs so yeah, that definitely puts Duet under stress.
Re collecting the M122 data: are there any docs for the `Software reset code`, `Error status` and `Aux0 errors`? I would like to understand the meaning of the reported vars.

=== Platform ===
Last reset 17:44:30 ago, cause: power up
Last software reset time unknown, reason: User, Gcodes spinning, available RAM 2276, slot 0
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0xe000ed38 SP 0x00000000 Task MAIN Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0

Note: This was taken just now and without any OOM.
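Since the advice in this thread is to capture M122 output right after each OOM, it may help to pull the same few fields out of every report so they can be compared side by side. Below is a minimal Python sketch that does this for the Platform section pasted above. It only records the raw values and does not interpret them. The dictionary keys and the function name `parse_platform` are invented for this example, and the patterns assume the line layout shown in this one report.

```python
import re

# The Platform section exactly as pasted in this thread.
SAMPLE = """=== Platform ===
Last reset 17:44:30 ago, cause: power up
Last software reset time unknown, reason: User, Gcodes spinning, available RAM 2276, slot 0
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x00400000 BFAR 0xe000ed38 SP 0x00000000 Task MAIN Freestk 0 n/a
Error status: 0x00
Aux0 errors 0,0,0
"""

# Hypothetical field names mapped to patterns based on the sample above.
PATTERNS = {
    "reset_cause": r"Last reset .*?, cause: (.+)",
    "software_reset_reason": r"Last software reset .*?, reason: (.+?), available RAM",
    "available_ram": r"available RAM (\d+)",
    "software_reset_code": r"Software reset code (0x[0-9a-fA-F]+)",
    "error_status": r"Error status: (0x[0-9a-fA-F]+)",
    "aux0_errors": r"Aux0 errors ([\d,]+)",
}


def parse_platform(text):
    """Extract selected Platform fields from M122 text as raw strings."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        # Missing fields are stored as None rather than raising an error.
        fields[name] = match.group(1).strip() if match else None
    return fields


if __name__ == "__main__":
    for key, value in parse_platform(SAMPLE).items():
        print(key, "=", value)
```

Values are kept as strings on purpose, so a report can be logged without deciding how a given code should be interpreted.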