Question about the quality of the Duet software..

natthapol.v

Hi, I have experience in software testing in the automotive industry.
Looking at the source code on Github, I have not seen any information about testing of the Duet firmware.

Recently (v3.4+), I've experienced a fairly large number of software issues that weren't there in the previous betas but were, surprisingly, introduced in the newer beta/RC versions. That is why I'm starting to question the quality of the firmware itself with regard to the implementation of new features.

To me, Duet firmware seems like a reasonably sized project that demands perfectly executed software features and execution order. Imagine people using this software on their CNC mill or plasma cutter, and the software suddenly behaving unexpectedly and leading to a crash on the CNC or a fire hazard on the plasma cutter. However, I've seen none of the basic set of software tests performed on the project itself.

I'm just trying to point out that the longer this is released untested with an increasing number of new features, the messier, more problematic, and more unreliable it will become in the future. Do not count on real-world users to test the software and report bugs to you. People will tend to move on and find an alternative that ensures the expected quality.

From what I've observed of the Github commit activity, the firmware is mostly a one-man show, with one person doing almost all of the implementation (1 man for firmware, 1 man for DWC, 1 man for PanelDue, 1 man for DSF, etc.). In my experience, this has led to the downfall of several projects and to abandonware in the automotive industry. OEMs that experience this kind of problem will mostly find suppliers/subcontractors to take over this deep pile of shit from them, and let the cheaper labor ensure the software quality for them instead. Recalls are very costly in the industry, and I've seen a recall caused by a software module that was tested once, after which everybody thought that a retest was unnecessary because the module was untouched between software releases. Bugs began to show once this module was integrated with another module in the system.

Each software feature/functionality deserves, at the least, its own unit tests, which must be executed prior to release. I've not even moved up to the integration tests or system tests that are usually performed sequentially to ensure quality in a proper software release cycle.

You have written good enough system/architecture requirements in Dozuki, but they lack traceability and a verification step to ensure the requirements are fully met. I think well-written requirements, plus a test project on Github that is linked to the intended requirements, would be a great starting point.
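As a rough illustration of the traceability idea, here is a minimal sketch in Python. The requirement IDs, requirement texts and test names are all hypothetical and do not come from the Duet documentation or repositories; the point is only that once each test declares which requirements it verifies, the unverified requirements can be listed automatically.

```python
# Hypothetical requirements, keyed by ID (illustration only).
REQUIREMENTS = {
    "REQ-001": "Heater output is switched off when a temperature fault is raised",
    "REQ-002": "Axis moves are rejected when the axis has not been homed",
    "REQ-003": "Fan PWM value is reported to two decimal places",
}

# Hypothetical tests, each listing the requirement IDs it claims to verify.
TESTS = {
    "test_heater_fault_shutdown": ["REQ-001"],
    "test_move_before_homing": ["REQ-002"],
}


def untested_requirements(requirements, tests):
    """Return the sorted IDs of requirements that no test is linked to."""
    covered = {req_id for linked in tests.values() for req_id in linked}
    return sorted(set(requirements) - covered)


if __name__ == "__main__":
    for req_id in untested_requirements(REQUIREMENTS, TESTS):
        print(f"{req_id} has no linked test: {REQUIREMENTS[req_id]}")
```

A report like this could run before each release, so that a requirement without a linked test is visible rather than silently skipped.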

V model software development

oliof

@natthapol-vanasrivilai I would suggest that before you start with software development 101, you look up the background of the developer you are trying to lecture here. @dc42 is also behind eschertech.com, which has a formal C++ verifier in its product portfolio. If you take more than a glancing look at the github repo, you will see that the code is at times adjusted to allow the verifier to run.

Formal verification may not be a common way to ensure software quality, but it's an approach that can cover a lot of issues that unit testing and integration testing cannot, or can only do at very high cost.

You should also consider that automotive control systems usually apply to systems that have a (mostly) fixed configuration, while RepRapFirmware does allow free-form reconfiguration of the system. I posit that no amount of unit and integration testing will be able to catch all errors in such a system. Automotive projects also have development budgets of hundreds of millions of monetary units, which allows large and cumbersome software development processes to be run.

As such, the release of beta and release candidate versions -- which by definition aren't deemed production-ready releases -- is another way to gather more input about new changes, their performance, bugs, unintended side effects, etc. One of the reasons that 3.4 is still not released is exactly that there are known issues that need to be analysed, the root cause found, fixed, tested, integrated, and released.

PS: In my 20+ year career in IT, I've seen "reasonably sized" software with unit tests and a code coverage quotient well beyond 90% that simply didn't work in practice. So I completely disagree with you that the existence or absence of unit tests allows any guess about the quality of the software under test.

dc42

@natthapol-vanasrivilai, thanks for your thoughts.

As @oliof said, I am well aware of the type of development process used in aerospace and some other fields of safety-critical software development (though sadly not always in the automotive industry, as seen from the legal cases faced by Toyota and others). Even now, car recalls for safety-related software updates are fairly common - and who knows how many OTA updates Tesla has pushed for safety-related reasons.

I would love to have an independent V&V team working to validate RRF, DSF and DWC. If everyone using our software was prepared to pay a few £1000s for the privilege plus several £100s in annual maintenance fees, then we could no doubt afford such a team. However, we wouldn't have got where we are today if the software wasn't FOSS. Unfortunately we have to fund software development entirely from profits made on selling Duet hardware - and increasingly, RRF is run on cloned Duets (which we're not keen on, but is allowed because Duet hardware is also open source) and on non-Duet hardware (which we regard as a good thing - in particular, Team Gloomy has alerted us to a number of bugs in new code before we found them ourselves). So we have to make the best of a limited budget.

Regarding unit testing, this is useful for some types of modules, especially where the functionality is complex. DSF does have some unit tests, see https://github.com/Duet3D/DuetSoftwareFramework/tree/master/src/UnitTests. For RRF my preference is to perform formal verification of module properties. We're not there yet because the available tools do not yet cover some of the additions to the C++ language in the last 10 years that RRF uses; but we're working on it. Nevertheless, some critical modules have been formally verified; for example the fast 62-bit integer square root algorithm that RRF used to compute motion control up to and including version 3.3, and the table search function used to calculate PT100 and PT1000 temperatures. Many more modules have precondition and other annotations to facilitate formal verification. I hope that we will soon be able to perform full unit-level formal verification of RRF.
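To illustrate what a "module property" means for something like an integer square root, here is a minimal sketch in Python. It is not RRF's code (RRF is written in C++ and its algorithm is not reproduced here); the function name `isqrt62`, the textbook bit-by-bit method and the helper `satisfies_postcondition` are assumptions chosen for illustration. A formal verifier would prove the postcondition for every input that meets the precondition, whereas a unit test can only sample it.

```python
def isqrt62(n: int) -> int:
    """Floor of the square root of n, for 0 <= n < 2**62 (bit-by-bit method)."""
    assert 0 <= n < (1 << 62)  # precondition on the input range

    root = 0
    bit = 1 << 60  # largest power of four that fits below 2**62
    while bit > n:
        bit >>= 2

    while bit != 0:
        if n >= root + bit:
            n -= root + bit
            root = (root >> 1) + bit
        else:
            root >>= 1
        bit >>= 2
    return root


def satisfies_postcondition(n: int, r: int) -> bool:
    """The property to establish: r is the largest integer with r*r <= n."""
    return r * r <= n < (r + 1) * (r + 1)


if __name__ == "__main__":
    # A test samples the property at chosen points; a proof covers all inputs.
    for value in (0, 1, 15, 16, 17, (1 << 62) - 1):
        assert satisfies_postcondition(value, isqrt62(value))
    print("sampled postcondition checks passed")
```

The precondition and postcondition here play the same role as the annotations mentioned above: they state what the module must guarantee, independently of how it is implemented.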

Looking at the bugs found in RRF 3.4.0RC1, these are listed in the "Bug fixes" section of the draft release notes for RC2 at https://github.com/Duet3D/RepRapFirmware/wiki/Changelog-RRF-3.x-RC#reprapfirmware-340rc2. Of these:

#1 could possibly have been detected by an appropriately-constructed unit test, and certainly by a system test
#2 was caused by a change made by a third party, so out of our control
#3 would have needed a system test, plus additional specification of how accurately each component of the object model should report the underlying value
Likewise for #4. In addition this only occurred with the combination of a rarely-used feature (use of M301 to override the model-generated PID parameters) and the P value being unusually low.
#5 would only have been revealed by testing the changed functionality on all PWM-capable ports of the Duet 3 Mini. I tested it on one PWM-capable port on a Mini 5+ but that port wasn't IO1.
#6 and #8 I did not have a suitable machine to test on. I currently have five 3D printers of which I test on four, plus two bench systems that I can reconfigure, but I did not have either of them set to a configuration required to detect these issues. Some other members of our team did have suitable machines, but none were available to run tests. The new feature causing #6 and #8 had already been released in beta7, so I was hopeful that any associated bugs would already have been reported. In retrospect I should perhaps have delayed the RC by several days to allow more tests to be run.
#7 should have been detected by a system test.

So one of these issues might have been detected by a unit test, and the remainder only by system tests. This confirms my view that any additional testing we do would be best conducted at the system level. However, RRF supports a huge variety of systems, so we will only ever be able to test a tiny subset of the system space that RRF can be used in. In contrast, the software for a car only needs to support a few different trim levels, most of which are subsets of the top trim level.

My ideal would be to construct formal models of many system configurations, and run them to verify the required system properties. I know this is possible in principle, because I had a plan to develop tools to do this nearly 20 years ago. This would be a lot more cost-effective than setting up many different systems; but there could still be subtle timing issues that the model didn't capture, so it wouldn't be perfect.

Given the constraints we operate under, and that RRF 3.4 has (from the release notes) 86 new/changed features and 68 bug fixes since RRF 3.3 including major changes to the motion system to support input shaping, IMO recording only 8 new bugs in the first RC is quite good. Of course if we could afford to have an independent V&V team execute a lengthy test procedure on a lot of different machine configurations, we could do better.

Regarding the nature of our team: over 40+ years of software development working for several companies, I have learned that a small team of highly-competent developers with the right teamwork approach can accomplish the same productivity and the same or better code quality than about ten times the number of average software developers, provided the project is not so large that too many developers are needed. The investment banking companies were well aware of this - I worked for one for a while, and the number of interview candidates with Oxbridge PhDs we rejected was amazing!

alankilian

@dc42 said in Question about the quality of the Duet software..:

If everyone using our software was prepared to pay a few £1000s for the privilege plus several £100s in annual maintenance fees, then we could no doubt afford such a team.

I've only ever bought one Duet2 and since it's likely to keep running for the rest of my printer's life I won't be sending any additional funds your way.

Do you have a way for those of us with some extra folding money to be able to support your work financially through a donation once in a while?

mikeabuilder

I second @alankilian's suggestion. At a minimum, knowing if there is a local pub where happy users could pay off part of the company tab.

And regarding testing, maybe there is a process whereby users could volunteer to run some non-destructive tests on our machines to contribute to a larger data set? Maybe even contribute some modules?

Phaedrux

@alankilian @mikeabuilder
Being active on the forums to help new users is immensely helpful. As is running beta and RC releases when available and providing feedback.

zapta

@dc42 said in Question about the quality of the Duet software..:

I worked for one for a while, and the number of interview candidates with Oxbridge PhDs we rejected was amazing!

Are you a Cambridge PhD yourself?

natthapol.v

@oliof @dc42
Thank you both very much for your responses. I personally just have doubts about the quality and test coverage of the more recent software releases. I know that a whole team of devs/testers is needed, with enough funds to feed into the development process. I'm not blaming anyone here, nor am I trying to say that anyone is not competent enough to develop good software.

V3.4 is still in its development cycle, and a release candidate was just recently released. From my point of view, a release candidate should be the most recent stable version, with zero new features allowed into the software cycle. From this stage onwards, the focus should be only on intensive testing and bug fixing. Maybe you guys have already done enough testing that wasn't documented on Github. I don't mind if the current development process already fits your team workflow. It's really hard for a small team to develop really stable software. It's also hard for one man to keep track of the bugs that were reported while also bringing new features to the software at the same time.

Devs hate testers for how annoying and picky they can be.
Testers also hate it when devs make errors on a simple task. No human is perfect, even if you think you are.
If devs have to test their own software, maybe they will find a few bugs here and there, but not as many as when a 2nd/3rd person looks at it against the written requirements.

Automotive ECUs are fairly flexible as well, with homologation laws and multiple variants of car configuration possible. The same ECU will also go into several car brands and models in its parent group. Some control units from Audi will also land in VW, Porsche, Lamborghini, Bentley, Skoda, etc. Very flexible indeed. They were still able to ensure great software quality in their system as well.

System tests are very costly and time-consuming to implement and run. But skipping the earlier tests in the V-chain is not going to help uncover bugs in an isolated environment. At least some tests are better than no tests at all.

I like the Duet ecosystem a lot and own a genuine board. All the printers that I've built were designed to run Duet from the very start. I never plan to switch to Klipper or anything similar in the near future. I'm glad to hear that further testing is planned in the near future.

oliof

@natthapol-vanasrivilai I am doing my part testing releases on the machines I have, as do many others here. Other people provide documentation fixes or their own code contributions such as the hangprinter people.

You say the project should now focus on improving code quality. That's exactly what the RC stage of a software release cycle is for.

Since you profess experience and knowledge about testing software, and this is an open source project, you could easily contribute to the project by bringing in your expertise. @dc42 has taken time out of his day to address this and provided pointers where he believes unit tests might have helped avoid issues, so you know exactly where to start and prove your claims by doing your part (-:

dc42

@natthapol-vanasrivilai said in Question about the quality of the Duet software..:

Maybe you guys have already done enough testing that wasn't documented on Github.

We don't document testing on Github.

I don't mind if the current development process already fits your team workflow. It's really hard for a small team to develop really stable software. It's also hard for one man to keep track of the bugs that were reported while also bringing new features to the software at the same time.

We had planned to migrate the enhancement and issues list to Github Issues prior to the 3.4 release, but are still waiting for one of us to have time to migrate them all, as there's not much point in migrating just some of them. In fact the current issue tracking spreadsheet works quite well for us, although of course it doesn't allow others to track what we are doing.

Devs hate testers for how annoying and picky they can be.

Actually I like picky testers! I'm picky myself.

If devs have to test their own software, maybe they will find a few bugs here and there, but not as many as when a 2nd/3rd person looks at it against the written requirements.

I agree, independent V&V is best.

System tests are very costly and time-consuming to implement and run. But skipping the earlier tests in the V-chain is not going to help uncover bugs in an isolated environment. At least some tests are better than no tests at all.

Most of the serious bugs we find in RRF are system-level bugs, and only found by testing on a real 3D printer or at least a bench system.

I like the Duet ecosystem a lot and own a genuine board. All the printers that I've built were designed to run Duet from the very start. I never plan to switch to Klipper or anything similar in the near future. I'm glad to hear that further testing is planned in the near future.

Thanks!

dc42

@zapta said in Question about the quality of the Duet software..:

Are you a Cambridge PhD yourself?

Yes. My PhD thesis went online recently, https://www.repository.cam.ac.uk/handle/1810/331159.

T3P3Tony

@phaedrux said in Question about the quality of the Duet software..:

@alankilian @mikeabuilder
Being active on the forums to help new users is immensely helpful. As is running beta and RC releases when available and providing feedback.

I would like to second this. It is immensely helpful for people to test Beta and RCs on their machines!

zapta

@dc42 said in Question about the quality of the Duet software..:

Yes. My PhD thesis went online recently, https://www.repository.cam.ac.uk/handle/1810/331159.

Very interesting. I expected something related to computing. Maybe the Duet IR sensor has more theoretical depth than we expected.

Getting a PhD and being super productive in the development of real-life software applications are two different skills. You seem to have both. As your fellow Brit Ali G. says, Respect!

deckingman

@zapta Just for info, DC mentioned "Oxbridge" graduates. This is a peculiar term which might not be apparent to non-native English speakers, but it means graduates of either Oxford or Cambridge universities.

dc42

@zapta said in Question about the quality of the Duet software..:

Getting a PhD and being super productive in the development of real-life software applications are two different skills. You seem to have both. As your fellow Brit Ali G. says, Respect!

Thanks! To be honest, the most useful skill that I learned while doing my PhD was how not to do systems engineering.

zapta

@deckingman said in Question about the quality of the Duet software..:

it means graduates of either Oxford or Cambridge universities.

Thanks @deckingman. This explains why when I searched for Oxbridge University I got this

gnydick

@natthapol-vanasrivilai I have to wonder what your background is to be able to comment on software development practices when you mention that it's difficult for one man to keep track of bugs. There are tools for that and it should be trivial for him with his amount of experience.

zapta

@natthapol-v said in Question about the quality of the Duet software..:

In my experience, this has led to the downfall of several projects and to abandonware in the automotive industry.

The automotive industry uses primitive and ad hoc processes compared to the aerospace industry, hence the frequent recalls.

Duet and RRF would be better off adopting the superior quality standards of the aerospace industry since they deal with a similar problem, controlling a reliable movement in a three dimensional space.

Seriously, want more resources invested in Duet/RRF development? Buy more boards.