Duet 2 Ethernet WC 3.3.0 crashes, have to reset to reconnect
-
@Phaedrux
Wow, this is not good. The replacement board was installed on the weekend and I did my first print today. PETG, so no heat in the enclosure, a 35 minute print job. When it finished I went into the other room, pulled the bed out manually (I have been doing that since I got it, as the manufacturer said it is fine to do), removed the print, returned to my desk, and I am now disconnected. Checked M552: it still had an IP, and the network module lights were still flashing. I connected over USB with OctoPrint; the heated bed was still on, as it should have been since I did not turn it off at the end of the job. Hit the reset button, and after boot it was back on the network. So what is the next step? I guess I will email Matterhackers and let them know I still have the same problem? Is there any more troubleshooting we can do? I cannot recreate the problem on demand.
-
It's pretty unlikely that the new board has an issue with ethernet as well. I think it's more likely that the issue is on the network hardware side. The Duet appears to still have an address and is seeing activity.
What network hardware is between your PC and the Duet? Do you have any means of testing something alternative?
-
@phaedrux I tested that with the first board. A cable was run directly from the failing Duet to a different port on the switch. Remember, it can be corrected by pushing the reset button on the board, so how would that fix my network? I would like it to be a bad cable, but it is not; the cable was swapped out with the old board and there was no change. So far I have not had a second failure. I used it all day yesterday and today. Without being able to reproduce the problem on demand it is going to be very difficult to track down.
Are there any other logs that can be configured on the duet?
I had thought it was maybe triggered by heat, but when it happened with the new board I was printing PETG. Today I had the enclosure running at 100F all day printing ABS with no failures. I am willing to try anything.
Question: if next time it fails I remove the network cable from the card and plug in a different cable, running it directly to the switch and into an unused port, would you assume that eliminates my network? Currently the network path is: switch to jack, cable through the wall to jack, jack to an 8-port switch which has the Pi (OctoPrint) and the Duet plugged into it. Both switches are D-Link unmanaged switches, a 24-port and an 8-port. DHCP is from the WAN router, which is a Fios gateway. I have many devices on the network with no issues on any of them.
Is there a way to ping the gateway from the USB connection using octoprint next time the failure occurs?
Also not sure on how the serial numbers work, but the first one was w09221 and the replacement is w09219. I am guessing they were built with all the same lot of parts and could wave to each other on the production line.
I just want to get to the root cause and correct it, be it local network or bad solder joint between the module and board etc.
After having 6 hard drives go bad in new IBM P950 enterprise servers within a week, I know even new enterprise-class hardware can be manufactured with defects, and of course even technical users do make dumb mistakes, so any other ideas on ways to track this next time it happens will be much appreciated!
Just thought of something that maybe I am doing wrong. My assumption was that I can access the WC from multiple remote locations at the same time. For instance, I log in from the browser on the Pi, which is plugged into the same 8-port switch where the Duet is connected and sits next to the printer, to load filament. I also connect from my PC in the other room. When the failure happens both browsers are disconnected. Now this is not what is causing my issue, is it? I mean, I did this with the Pi all the time when I was using OctoPrint, so I assumed it is just a web interface and it would not matter how many connections I make.
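One way to timestamp the next failure is to poll the Duet from the Pi on the same switch and record when its web port stops answering. The following is only a minimal sketch under stated assumptions: the function names `is_reachable` and `watch` are made up for illustration, port 80 is assumed to be where the web interface listens, and a plain TCP connect is used as the test, which shows only whether the board accepts connections, not why it stopped.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def watch(host, port=80, interval=10.0, checks=None, log=print):
    """Poll host:port and log a timestamped line whenever reachability changes.

    checks=None polls forever; an integer stops after that many polls.
    Returns the list of (timestamp, state) transitions that were logged.
    """
    transitions = []
    last = None
    polls = 0
    while checks is None or polls < checks:
        state = is_reachable(host, port)
        if state != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            log(f"{stamp} {host}:{port} {'UP' if state else 'DOWN'}")
            transitions.append((stamp, state))
            last = state
        polls += 1
        if checks is None or polls < checks:
            time.sleep(interval)
    return transitions
```

Calling `watch("<duet-ip>")` with the address that M552 reports would print one line each time the board goes from reachable to unreachable or back, which can then be lined up against print start and end times.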
-
@Phaedrux OK in failed state after first print of the day finished. I ran a known good network cable from an open port of the 24 port switch directly to the duet.
The Duet shows link and shows an activity light; the switch end shows link but no activity. Moved the same cable from the Duet to the Pi and got link and activity at the switch. Using the USB connection to OctoPrint, I disabled the networking on the Duet and then enabled it, setting the IP as it was prior. Refreshed the browser and after a few seconds got connected to WC!
So what does this tell you? Seems like a software/firmware process died and needed to be restarted? Again I will ask: is there a way to see a process table from USB or by other means I may have access to?
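The recovery described above (networking disabled and then re-enabled over USB) could be scripted roughly as follows. This is a sketch, not a tested procedure: the helper names are hypothetical, the two-second pause between commands is a guess, and `port` is assumed to be an already-open serial object such as pyserial's `serial.Serial`. The commands themselves are the standard `M552 S0` (disable) and `M552 S1 P<ip>` (enable with a fixed address).

```python
import time


def network_restart_commands(ip_address):
    """Build the G-code lines that turn Duet networking off, then on again."""
    return ["M552 S0", f"M552 S1 P{ip_address}"]


def send_commands(port, commands, pause=2.0):
    """Write each command to an open serial-like object and collect replies.

    `port` only needs write(bytes) and readline() -> bytes, which is the
    interface that pyserial's serial.Serial provides.
    """
    replies = []
    for command in commands:
        port.write((command + "\n").encode("ascii"))
        time.sleep(pause)  # assumed settle time; adjust as needed
        reply = port.readline().decode("ascii", errors="replace").strip()
        replies.append(reply)
    return replies
```

With pyserial installed, something like `send_commands(serial.Serial("/dev/ttyACM0", 115200, timeout=2), network_restart_commands("<your-ip>"))` would be the intended use; the device path is an assumption, and OctoPrint would need to release the serial port first. Typing the same two commands into the OctoPrint terminal achieves the same thing.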
-
@airscapes I had my first board replaced. Unfortunately, this board is doing the same. "Connection lost, attempting to reconnect." I purchased a screen as I could not rely on the ethernet connection on the Duet 2. My machine is on, not printing, and I can see that I require a reboot to fix the behaviour. I was looking at the config.g file to determine what I might have to do if I upgrade to version 3.4.0. I have tried three different switches, several cables, and I get the same errors. This isn't what I expected. On long prints, the screen has been the only way to interact with the Duet 2. Once the network goes funky, cycling the power appears to be the only way to restore communication.
-
@trobison Have you tried connecting to the USB port with a terminal program, sending M552 S0 to disable the network, then sending M552 S1 Pxx.xx.xx.xx to enable the network and set your IP? This worked for me yesterday after the connection dropped. The network came back, I never hit reset, and it worked the rest of the day. I would also like to know what the serial number is on your board. There is a sticker on the box it came in and one on the CPU on the board. If you already exchanged it I would assume you know that. It seems like something crashes in the network module, if my workaround continues to restore the connection after a failure. I am lucky, as I was using OctoPrint with the printers for years before this "upgrade". I must say, I have grown to like the DWC interface as much as, if not more than, OctoPrint for what I do. Please send your issue to your reseller if you have not already done so and ask them to forward it to Ronald@duet3d.com, as he is the person Matterhackers forwarded mine to.
It would be good for them to know there are multiple customers with the same issue, if they don't already know about it.
Thanks for posting and validating that this is a real problem unrelated to external network issues.
-
We are trying to work out a way to get you a replacement ethernet module to test with. Thanks for your patience.
@trobison Do you have a thread of your own discussing your issue already?
-
@phaedrux No worries, this is really a weird one. My first board is on its way back to Matter Hackers; it should be there by now, so I am not sure how long it will take to get back to the manufacturer. That one failed much more often than this one.
I have been printing ABS parts for the past 2 days without so much as a blip. Wish I knew what triggered it, but I am glad I have a workaround that does not involve resetting the board, since the print job is not interrupted in any way.
-
No, I did not create a thread. I tried connecting via USB and resetting the network. I still have the same issues with network dropouts. The chip has a sticker reading DUET2 Main W06367. This morning I powered on the printer and attempted to print. I have issues uploading files to the printer because the network drops out. The NIC lights are correct, but the connection drops randomly. It faults more while printing.
13/02/2022, 09:36:19 Failed to upload Desk_Stand.gcode Could not complete action because the connection has been terminated
13/02/2022, 09:36:19 Connection interrupted, attempting to reconnect... Network error
13/02/2022, 09:35:57 Successfully deleted TESTPRINT.gcode
13/02/2022, 09:31:02 Finished printing file 0:/gcodes/TESTPRINT.gcode, print time was 0h 48m
13/02/2022, 09:31:00 Connection established
13/02/2022, 09:31:00 Connection interrupted, attempting to reconnect... Operation failed (Reason: Service Unavailable)
13/02/2022, 09:10:37 Connection established
13/02/2022, 09:10:36 Failed to upload Print-In-Place Phone holder.gcode Could not complete action because the connection has been terminated
13/02/2022, 09:10:36 Connection interrupted, attempting to reconnect... Network error
13/02/2022, 09:08:25 Successfully deleted 2 items
13/02/2022, 08:40:55 Cancelled printing file 0:/gcodes/TESTPRINT.gcode, print time was 0h 5m
13/02/2022, 08:40:46 Resume state saved Warning: Obsolete use of S parameter on G1 command. Use H parameter instead.
13/02/2022, 08:39:01 Connection established
13/02/2022, 08:39:00 Connection interrupted, attempting to reconnect... Operation failed (Reason: Service Unavailable)
13/02/2022, 08:37:36 Connection established
13/02/2022, 08:37:36 Connection interrupted, attempting to reconnect...
13/02/2022, 08:35:33 M32 "0:/gcodes/TESTPRINT.gcode" File 0:/gcodes/TESTPRINT.gcode selected for printing
13/02/2022, 08:35:25 Upload of TESTPRINT.gcode successful after 5s
13/02/2022, 08:35:02 Connected to 192.168.10.60
13/02/2022, 09:54:06 m122
=== Diagnostics ===
RepRapFirmware for Duet 2 WiFi/Ethernet version 3.3 (2021-06-15 21:44:54) running on Duet Ethernet 1.02 or later + DueX5
Board ID: 0JD0M-9P6M2-NWNS0-7J9DJ-3SJ6S-K90RJ
Used output buffers: 12 of 24 (24 max)
=== RTOS ===
Static ram: 23876
Dynamic ram: 73712 of which 0 recycled
Never used RAM 13972, free system stack 98 words
Tasks: NETWORK(ready,29.2%,231) HEAT(delaying,0.1%,330) Move(notifyWait,0.2%,307) DUEX(notifyWait,0.0%,24) MAIN(running,70.4%,441) IDLE(ready,0.1%,29), total 100.0%
Owned mutexes:
=== Platform ===
Last reset 00:12:46 ago, cause: power up
Last software reset at 2022-02-12 17:20, reason: User, GCodes spinning, available RAM 16828, slot 2
Software reset code 0x0003 HFSR 0x00000000 CFSR 0x00000000 ICSR 0x0041f000 BFAR 0xe000ed38 SP 0x00000000 Task MAIN Freestk 0 n/a
Error status: 0x0c
Aux0 errors 0,1,0
Step timer max interval 0
MCU temperature: min 34.1, current 37.9, max 38.0
Supply voltage: min 23.9, current 24.0, max 24.2, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 99/1, heap memory allocated/used/recyclable 2048/86/0, gc cycles 0
Driver 0: position 15650, ok, SG min/max 0/1023
Driver 1: position 754, ok, SG min/max 0/1023
Driver 2: position 4627, ok, SG min/max 0/1023
Driver 3: position 3641, standstill, SG min/max not available
Driver 4: position 0, ok, SG min/max 0/1023
Driver 5: position 0, standstill, SG min/max not available
Driver 6: position 0, standstill, SG min/max not available
Driver 7: position 0, standstill, SG min/max 0/56
Driver 8: position 0, standstill, SG min/max not available
Driver 9: position 0, standstill, SG min/max not available
Driver 10: position 0
Driver 11: position 0
Date/time: 2022-02-13 09:54:06
Cache data hit count 4294967295
Slowest loop: 175.42ms; fastest: 0.14ms
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
=== Storage ===
Free file entries: 9
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest read time 1.2ms, write time 50.3ms, max retries 0
=== Move ===
DMs created 83, maxWait 291747ms, bed compensation in use: mesh, comp offset 0.000
=== MainDDARing ===
Scheduled moves 2259, completed moves 2229, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 1], CDDA state 3
=== AuxDDARing ===
Scheduled moves 0, completed moves 0, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 0], CDDA state -1
=== Heat ===
Bed heaters = 0 -1 -1 -1, chamberHeaters = -1 -1 -1 -1
Heater 0 is on, I-accum = 0.2
Heater 2 is on, I-accum = 0.5
=== GCodes ===
Segments left: 1
Movement lock held by null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is doing "G1 X81.698 Y83.929 E304.1846" in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
Daemon is idle in state(s) 0
Autopause is idle in state(s) 0
Code queue is empty.
=== DueX ===
Read count 1, 0.08 reads/min
=== Network ===
Slowest loop: 202.93ms; fastest: 0.02ms
Responder states: HTTP(0) HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0), 0 sessions
HTTP sessions: 2 of 8
Interface state active, link 100Mbps full duplex
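To compare dropouts against print activity, the pasted DWC console entries in this post can be split into timestamped records. The sketch below assumes the `dd/mm/yyyy, hh:mm:ss` stamp format seen in this post; the function names are illustrative and not part of any Duet tool. It splits on the timestamp pattern rather than on newlines, so it also copes with a paste where all entries have run together on one line.

```python
import re
from datetime import datetime

# Timestamp format as it appears in the pasted console log (assumed dd/mm/yyyy).
STAMP = re.compile(r"(\d{2}/\d{2}/\d{4}, \d{2}:\d{2}:\d{2}) ")


def parse_console_log(text):
    """Split pasted DWC console text into (datetime, message) entries."""
    # With a capturing group, split() returns
    # [prefix, stamp1, msg1, stamp2, msg2, ...].
    parts = STAMP.split(text)
    entries = []
    for stamp, message in zip(parts[1::2], parts[2::2]):
        when = datetime.strptime(stamp, "%d/%m/%Y, %H:%M:%S")
        # Collapse stray whitespace and newlines inside the message.
        entries.append((when, " ".join(message.split())))
    return entries


def disconnect_times(entries):
    """Return timestamps of 'Connection interrupted' entries, oldest first."""
    hits = [when for when, msg in entries
            if msg.startswith("Connection interrupted")]
    return sorted(hits)
```

Run over the console log in this post, `disconnect_times` would pick out the five "Connection interrupted" entries between 08:37:36 and 09:36:19, which makes it easier to see that they cluster around uploads and the running print.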
-
@trobison Just to ensure we've covered all the bases, do you have another SD card to test with?
-
@phaedrux This is a replacement card. I went through the online configurator to get the most recent stable version. Do you require another SD card? If so, can I image the existing card? There are a lot of files for the tool-changer.
After a reboot, I was able to upload some jobs straight away. I used the screen to submit the jobs without issue. The network issues are random. Sometimes it's worse, other times it's good.
-
@trobison Switching to a new SD card is a simple matter of copying all the folders and files across, provided the old card still works.
-
@phaedrux I prepared a new SD card (SanDisk Extreme). I copied all files to the new card, and the Duet booted. I left the machine on for 6 hours and recorded only one disconnect. Then a test print was sent to the printer, which completed without error. The machine was powered off after that. The following day, I turned on the printer: No SD Card found. I took the card out and reinserted it. Same response. I put the card into a card reader and I could see all the contents. I replaced the card with the SD card I had earlier (this is a replacement card as well). I still get network dropouts. Network cables have been changed, and the switch has been changed. Having the PanelDue has proved useful for this issue.
-
@trobison Do any of the chips on the ethernet module get extremely hot to the touch?
-
@phaedrux I have a little more diagnostic info. During the last print I tried to make changes to my tool-changing routines. The network dropouts prevented this during printing. After the print I could make some changes, but I am still having random dropouts. I opened up a new enterprise HP switch as a test. There are only two ports in use on this 24-port switch. While I was working on tpre0.g the network dropped and cleared all contents in the file. This has happened 6 times today; changing the switch and network cables has not had an effect. Power cycling the Duet is the only way to restore comms. Thank goodness I have a PanelDue to see what is going on. It is becoming challenging to use my ToolChanger.
-
I photographed the network chip with my thermal camera; it reports 53 degrees. The network has failed minutes into a print. I can't see the web page: "Connection lost". I swapped the switches back, and it is not reachable on this switch either. I can see the printing status on the PanelDue. There are pauses in the print to insert nuts, but I can't locate layer lines on the panel. Hopefully it pops up a message on the Duet's screen.
-
@Phaedrux I have been printing items every day this week and have yet to experience a third connection drop with the new board. I am fairly sure the issue has not magically resolved itself, but there is nothing new to report at this time. I have not heard back from anyone in the warranty department as of yet. Hopefully once the first board gets back it can be analyzed.
-
@airscapes @trobison Can you both send an email to warranty@duet3d.com with your name and address? We will send out a replacement ethernet module.
-
@phaedrux Sure. Is it replaceable without soldering, or will the board need to be removed to R&R the module? Truthfully, I never looked that closely to see how it was installed.
Hmm... looks like it just plugs into headers. Maybe I should have done an R&R on the first board that was acting up constantly. @trobison, have you done this on your board? Just removed it and reinstalled it to see if there was any change?
-
Yes it's a small module that can be removed.
Perhaps as part of the exploration could you grab some close up shots of it? Perhaps there is a solder point that is suspect.