RRF 3.01-RC8/DWC 2.1.3/DSF 2.0.0: Mid-print hanging
-
Indeed. The first time it happened I had a panic because I thought it had completely died, then suddenly it sprang back into life. I'm also finding that the period it stops for is fairly random too, though it never seems to be more than ~5s or so.
-
Now this is interesting... Just got this message in conjunction with a hang:
[info] System time has been changed
-
Do the hangs perhaps occur when the NTP client updates the system time? Do the hangs stop if you disable the NTP process?
-
If the hangs are something to do with the NTP updates I'm curious to know why it only affects the latest release.
-
I've found a potential deadlock which I'm about to fix in DSF 2.1.0. I am wondering if it is related to the delays you've been seeing.
@gtj0 In your debug log there was a note saying "last transfer > 1s ago" which is quite unusual. What is the normal CPU load of your board while printing? If it is quite high, DCS tends to hang for a moment so perhaps we should increase the default priority of DCS in the systemd service file and check if that improves things.
PS: I tried to reproduce the hangs with the default priority, 100% CPU load, and the upcoming DSF 2.1.0 and I don't seem to get any more lags.
-
This bug has made this release unusable for me now, as it is causing large blobs to form when it pauses, which results in the head catching and knocking parts off the bed.
For now I've reverted to the RC7 setup until the next release is out.
-
@chrishamm said in RRF 3.01-RC8/DWC 2.1.3/DSF 2.0.0: Mid-print hanging:
I've found a potential deadlock which I'm about to fix in DSF 2.1.0. I am wondering if it is related to the delays you've been seeing.
@gtj0 In your debug log there was a note saying "last transfer > 1s ago" which is quite unusual. What is the normal CPU load of your board while printing? If it is quite high, DCS tends to hang for a moment so perhaps we should increase the default priority of DCS in the systemd service file and check if that improves things.
PS: I tried to reproduce the hangs with the default priority, 100% CPU load, and the upcoming DSF 2.1.0 and I don't seem to get any more lags.
I think increasing the priority is a good idea. Philosophically, I'd consider the DCS a "real-time" process. I do usually have Chrome running on my Jetson Nano displaying a video stream but that only results in about a 25-30% CPU load and I have tried it without the video stream running but still got the pauses.
I'll keep an eye out for DSF commits and test them as they become available.
-
@dc42 said in RRF 3.01-RC8/DWC 2.1.3/DSF 2.0.0: Mid-print hanging:
Do the hangs perhaps occur when the NTP client updates the system time?
I thought maybe but...
Do the hangs stop if you disable the NTP process?
No. That would have been too easy.
@chrishamm Is there something in the DCS synchronizing the time between the SBC and the Duet?
I'm wondering why the DCS prints that message in the first place.
-
@gtj0 That message pops up when the periodic updater task in DCS notices a time shift (> 5s) after a predefined delay. When it notices that, it thinks the system time has been changed and sends an update to the Duet.
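To illustrate the kind of check described above, here is a minimal Python sketch. It is not the actual DCS code: the function name, the 1s delay, and the use of an absolute difference are assumptions for illustration, and only the 5s threshold comes from the post. It also shows why a stalled process would trigger the same message as a real clock change.

```python
from datetime import datetime, timedelta

# Threshold taken from the post above ("> 5s"); everything else is assumed.
SHIFT_THRESHOLD = timedelta(seconds=5)


def time_shift_detected(before, after, planned_delay, threshold=SHIFT_THRESHOLD):
    """Return True if the wall clock moved by noticeably more (or less)
    than the delay the updater task intended to wait.

    before / after are wall-clock timestamps taken around the delay.
    """
    drift = (after - before) - planned_delay
    return abs(drift) > threshold


# Hypothetical timestamps, for illustration only.
start = datetime(2020, 1, 1, 12, 0, 0)
delay = timedelta(seconds=1)

# Normal case: the task woke up roughly when expected.
normal = time_shift_detected(start, start + timedelta(seconds=1.2), delay)

# Stalled case: the process hung for about 8s longer than planned, so the
# check cannot tell this apart from someone changing the system clock.
stalled = time_shift_detected(start, start + timedelta(seconds=9), delay)

print(normal, stalled)
```

With a check like this, a hang longer than the threshold would be reported as a time change even though the clock itself was never adjusted.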
-
@gtj0 said in RRF 3.01-RC8/DWC 2.1.3/DSF 2.0.0: Mid-print hanging:
I think increasing the priority is a good idea. Philosophically, I'd consider the DCS a "real-time" process. I do usually have Chrome running on my Jetson Nano displaying a video stream but that only results in about a 25-30% CPU load and I have tried it without the video stream running but still got the pauses.
While I agree with the idea of increasing the priority as I would also see DCS as a real time process, I wouldn't encourage it as a 'fix' for this hanging issue. Having moved back to 1.3.2 over an hour ago I've not experienced a single pause, whereas previously I was getting 10+ second pauses every 10/15 minutes.
I'm keen to try DSF 2.1.0 to see if the issue @chrishamm found was the cause.
-
@ChrisP said in RRF 3.01-RC8/DWC 2.1.3/DSF 2.0.0: Mid-print hanging:
@gtj0 said in RRF 3.01-RC8/DWC 2.1.3/DSF 2.0.0: Mid-print hanging:
I think increasing the priority is a good idea. Philosophically, I'd consider the DCS a "real-time" process. I do usually have Chrome running on my Jetson Nano displaying a video stream but that only results in about a 25-30% CPU load and I have tried it without the video stream running but still got the pauses.
While I agree with the idea of increasing the priority as I would also see DCS as a real time process, I wouldn't encourage it as a 'fix' for this hanging issue. Having moved back to 1.3.2 over an hour ago I've not experienced a single pause, whereas previously I was getting 10+ second pauses every 10/15 minutes.
I'm keen to try DSF 2.1.0 to see if the issue @chrishamm found was the cause.
Oh yeah. I wasn't suggesting it as a "fix". Just a future preventative measure.
-
@chrishamm said in RRF 3.01-RC8/DWC 2.1.3/DSF 2.0.0: Mid-print hanging:
@gtj0 That message pops up when the periodic updater task in DCS notices a time shift (> 5s) after a predefined delay. When it notices that, it thinks the system time has been changed and sends an update to the Duet.
Ok so a symptom not a cause then.
-
Thought I was crazy but I'm having the exact same problems. Worked great with RC7, but I've seen it pause for 5-10 seconds three times with RC8.
-
@chrishamm Not fixed with DSF commit d46965d and RRF commit 56c8e1d583d7f17e5c4266c7ce8cc227d40e3bdd (SBC improvements).
Still hanging for many seconds. No messages in DCS log.