Can we have a revised release process?
-
@elmoret to do what I described, I could implement it in a few days, provided I was brought up to speed on the hardware. It's quite possible we're envisioning different scope and scale.
I think it's more important to have a test harness that at least covers each g-code and catches regressions.
I agree, unit tests are always a PITA. But if it were my job, I would be embarrassed to have certain bugs slip out that are the equivalent of forgetting to make sure your servers' disks don't fill up.
-
@gnydick said in Can we have a revised release process?:
@elmoret to do what I described, I could implement it in a few days, provided I was brought up to speed on the hardware.
OK then. Here's the hardware:
1x https://www.mccdaq.com/usb-data-acquisition/USB-QUAD08.aspx
5x https://www.omc-stepperonline.com/Nema-17-Closed-Loop-Stepper-Motor-13Ncm184ozin-Encoder-1000CPR.html?search=encoder&sort=p.price&order=ASC
That covers all your steppers. Then you need a DAQ for DIO/AIO:
2x https://www.mccdaq.com/data-acquisition/low-cost-daq (the USB-200, specifically)
Two of the 8 channel DAQs would be plenty to cover fans, thermistors, endstops, heaters.
Tell you what - if you complete the project and dc42 finds it useful, I'll buy all the hardware back from you for original retail price, so you're only out the few days invested.
-
@elmoret it'll take more than that to learn all of those parts. I'm not experienced with embedded. I don't have the time or money to learn a ton of new things. But I'd be happy to take APIs provided and demonstrate what I'm talking about.
Are they high level interfaces or would I have to learn a ton of stuff just to get those probes bootstrapped and recording, synced, etc?
Are there simulators? I've found the KiCad code, but have no idea how to use it.
Long story short, if my knowledge can be bootstrapped, I can help.
-
Those products all have APIs and drivers already; you'd just call a function and get back stepper positions. They have example code you'd use to configure the DAQs. Here's the documentation:
https://www.mccdaq.com/pdfs/manuals/Mcculw_WebHelp/ULStart.htm
And here's a specific example, reading an analog input (for checking the state of fans and heaters, say). I picked Python, but they have examples in many programming languages:
https://github.com/mccdaq/mcculw/blob/master/examples/console/a_in.py
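To give a sense of the shape of it, a single analog read looks roughly like this (following the pattern in that example; the board number, channel, and range below are illustrative placeholders, not a wiring plan for this rig):

```python
# Minimal sketch of one analog read with the mcculw driver, following the
# linked a_in.py example. Board/channel/range values are placeholders.
from mcculw import ul
from mcculw.enums import ULRange
from mcculw.ul import ULError

board_num = 0                 # assumes the DAQ shows up as board 0 in InstaCal
channel = 0                   # analog input wired to, say, a fan or heater output
ai_range = ULRange.BIP5VOLTS  # pick the range to match what you're measuring

try:
    raw = ul.a_in(board_num, channel, ai_range)        # raw ADC counts
    volts = ul.to_eng_units(board_num, ai_range, raw)  # convert to volts
    print("Channel {}: {:.3f} V".format(channel, volts))
except ULError as e:
    print("DAQ error {}: {}".format(e.errorcode, e.message))
```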
Not sure what you mean by simulators? KiCad is a PCB layout program, like Altium.
Btw: if you think RRF has bugs in stable releases, check out Prusa's firmware - and they have roughly 100x the staff/resources of Duet3D!
https://github.com/prusa3d/Prusa-Firmware/issues/1362
"We found out what was the problem caused by. There was an antient bug (related to those errors) which was hot fixed by limiting the possible temp at which the error can be displayed. We removed the limitation in order to prevent dangerous behaviour and forgot it was originaly a hot fix. It should be fixed in FW 3.5.1."
So they're not even commenting their code when they put in hot fixes. And then they released 3.5.1 apparently without fixing the issue.
Not saying anything is excusable/allowable, but just saying I'll take RRF over any of the alternatives any day.
-
@elmoret cool. I'll check it out. By simulator, I mean just that. There are simulators for all sorts of things. There are circuit simulators; from my college days there was SPICE, for example.
There are PCB simulators, as well. I just have no familiarity with the field.
-
@elmoret that all looks high-ish level enough to work with, but it's obvious there's a great deal of background information needed.
So to try to make my ideas concrete, here's what I would do that could be done in a week.
First Pass - no motors, execution, etc.
- Define all of the g-code instructions I want to test
- Define all parameters to be tested for each g-code
- Define the expected output in terms of just the response from the board confirming that it received and ingested the command properly. NOT the action taken by the board
- I'm assuming there is only one handler needed initially for sending commands
- For each type of output, write a function to read from that endpoint
- Go back and categorize each g-code by which output endpoint is needed to read the result
- Using the APIs, write a test case executor that reads the inputs from your definition file, detects the category, calls the proper function to read the output, and compares it to the expected output
This would be very easily accomplished with the right background info about the ecosystem; a rough sketch of the executor is below.
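Just to make the executor concrete, something roughly like this - the serial port, the definition-file columns, and the reader helpers are all assumptions on my part, since I don't know the real interfaces yet:

```python
# Hypothetical first-pass executor: send a g-code, read the board's textual
# reply, compare against the expected output. Port name, CSV columns, and the
# 'ok' terminator are assumptions, not a real Duet/RRF interface description.
import csv
import serial  # pyserial

def read_serial_reply(port):
    """Collect reply lines until an assumed 'ok' terminator or a timeout."""
    lines = []
    while True:
        line = port.readline().decode(errors="replace").strip()
        if not line:
            break
        lines.append(line)
        if line.startswith("ok"):
            break
    return "\n".join(lines)

# One reader per output category; only the serial reply is sketched here.
READERS = {"serial": read_serial_reply}

def run_cases(defs_path, port):
    failures = []
    with open(defs_path, newline="") as f:
        # Assumed columns: gcode, category, expected
        for case in csv.DictReader(f):
            port.write((case["gcode"] + "\n").encode())
            actual = READERS[case["category"]](port)
            if case["expected"] not in actual:
                failures.append((case["gcode"], case["expected"], actual))
    return failures

if __name__ == "__main__":
    with serial.Serial("/dev/ttyACM0", 115200, timeout=2) as port:  # assumed port
        for gcode, expected, actual in run_cases("gcode_cases.csv", port):
            print("FAIL {}: expected {!r}, got {!r}".format(gcode, expected, actual))
```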
Second Pass
Same approach, but now we do the mechanical, electrical, etc. testing that those probe boxes afford. I've done this many times.
-
I'm all for improvements, and I'm not saying things couldn't be better, but I don't think you're being reasonable considering that the dev team in this case is a one man show with volunteer testers and a release cycle of maybe a month or two. As it stands, if you have a bug and report it, David is likely to have a bug fix release within days for you to try. An RC and a point release technically aren't the same thing, I agree, but for all intents and purposes, in this case it might as well be.
What exactly is the driver for this? Are you upset about the launch bugs of the maestro? When any new hardware comes in contact with a diverse user base there's going to be some bugs found. And fixes have been applied in a very respectable time frame. Does it make sense to delay 2.02 so that a point release for 2.01 can be issued with certain fixes but no new features?
And honestly, of all the things that could improve reprapfirmware usability, I don't think more released versions is one of them. People already have enough trouble keeping track, now you'd suggest having 2.01.1, 2.01.2 as well as 2.02 RCs?
If you want stable, stay on the latest full release. If you want the fix from an RC, evaluate the RC, and if it fixes what you want, run with it, if not, wait for the next full release, it's only a month or so away.
-
@phaedrux you're basically saying, tough noogies, you don't get fixes to the current stable release. That's just not good practice.
I understand being resource constrained, but I also know how software development goes.
I don't think anyone would get confused; you're not giving enough credit to people. It's reasonable to expect people to be able to tell the difference between something called latest stable and latest RC, no matter what the numbering scheme.
I have a somewhat rare skill in being able to see how things can play out. In my view, the longer this practice goes on, the harder and riskier it will be to change course should adoption of the hardware start to grow.
Don't get me wrong, I love the hardware, but it's not being supported correctly as a commercial product. I know the pain is low now, but again, if adoption grows, you will have much noisier and much less educated people demanding fixes.
It is just good business and engineering practice to fix stable and port to upstream, not ignore stable. There can't be two truths, and I'm saying the reasons for doing it as it is now are neither valid nor sensible.
If everyone is going to just ignore the potential value of my premises, then we're just going to talk past each other. How about somebody engage with me about my premises and discuss why they might actually be true? We'll collectively get nowhere if nobody is willing to put themselves in the other's shoes. I can and have put myself in the mindset of being dc; has anyone tried to argue my side in their head?
-
@gnydick said in Can we have a revised release process?:
First Pass - no motors, execution, etc.
- Define all of the g-code instructions I want to test
- Define all parameters to be tested for each g-code
- Define the expected output in terms of just the response from the board confirming that it received and ingested the command properly. NOT the action taken by the board
- I'm assuming there is only one handler needed initially for sending commands
- For each type of output, write a function to read from that endpoint
- Go back and categorize each g-code by which output endpoint is needed to read the result
- Using the APIs, write a test case executor that reads the inputs from your definition file, detects the category, calls the proper function to read the output, and compares it to the expected output
If you can implement this in ~24 man-hours ("a few days", at 8 hrs/day), I know several companies that would hire you. Keep in mind there are roughly 200 G-code commands, so that's about 7 minutes per command to flesh out a test. I don't know if I could even write all the possible permutations of a G-code (for example, M587 should realistically be tested with all possible types of SSIDs/passwords, to detect issues with escaped characters, etc.) in less than 10 minutes, much less generate the expected response and save it to a comparison file.
Keep in mind that many g-codes would be interdependent. A simple example is M667, which selects CoreXY mode - but after selecting the mode, don't you have to test all the G-codes again? What if selecting CoreXY mode and then probing the bed and then homing triggers a bug? Take your G29 example: the bug only surfaced if z-compensation was previously active. Now all of a sudden it isn't 200 unit tests, since there are many, many possible permutations.
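Just to put rough numbers on the blow-up (the counts below are illustrative, not a real inventory of RRF commands or machine states):

```python
# Back-of-the-envelope illustration of how mode and prior-state permutations
# multiply the test count. All numbers here are made up for illustration.
from itertools import product

kinematics_modes = ["Cartesian", "CoreXY", "Delta", "SCARA"]        # e.g. selected via M667/M669
prior_states = ["fresh boot", "homed", "homed + mesh compensation"]  # state before the command runs
commands = range(200)                                                # ~200 G/M-codes, per above

combos = list(product(kinematics_modes, prior_states, commands))
print(len(combos))  # 4 * 3 * 200 = 2400 cases, before any per-command parameter variations
```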
Again - I don't think anyone is ignoring the potential value with what you propose, we're just pointing out it may not be practical given the time/resource limitations of Duet3D. But if you want to try please don't let me discourage you! I'm sure dc42 appreciates all the help he can get.
-
@gnydick said in Can we have a revised release process?:
@danal I'm not sure if you read my entire original post, if not, you should.
Yes. And understood the proposal and its effects. Very thoroughly.
I don't know what everyone's backgrounds are,
I am an Enterprise Architect for a Fortune 50 during the day, specializing in development process optimization. I also own/operate an electronics company at night, and our products involve firmware. I generally try to stay away from "I'm qualified because..." discussions, but, in this case, there are probably very few people who experience both sides of the coin (large/small, client, mobile, server, firmware, etc.) to the extent that I do, every day.
If everyone is going to just ignore the potential value of my premises, then we're just going to talk past each other. How about somebody engage with me about my premises and discuss why they might actually be true? We'll collectively get nowhere if nobody is willing to put themselves in the other's shoes. I can and have put myself in the mindset of being dc; has anyone tried to argue my side in their head?
I can't speak for anyone else, but you are not talking past me, and I have modeled your process and its effects "in my head", and I believe I am fully grasping both your proposed process and its value proposition. See next post.
-
Summary of @gnydick's proposal:
Change from: releases marked "Stable", plus releases that contain both features and bug fixes, which continue toward the next "Stable". Repeat.
Change to: releases marked "Stable", plus two parallel paths, literally a code fork (in whatever VCS): one that is the prior Stable with bug fixes as they become available (and/or "hot fix" only on this path; let's not quibble over fix priority), and a parallel path that contains the prior Stable plus new features. These are merged at the next "Stable". Repeat.
===================
Here is where I may be "talking past you": I absolutely do understand all the effects of these two development cycles. (As an aside, I would pick a modified form of the second one for old style large teams, and/or modern "agile" teams.) I find the second one to be a negative value proposition given a one man development team.
The person-hours that go into the fork, and the merge, are a waste of a very constrained resource.
Customers, you, me, we should all move forward to, and/or regress to whatever release produces the best results for each of us. One person's desire for a new announced feature may drive them forward to stable, or even RCs, at a very different velocity than someone else. There is an individual choice to move forward or backwards to get the desired results.
All of the above is opinion... but there is one glaring fact: David can move forward with every kind of release (feature, fix, RC, stable, whatever) faster if hours do not disappear into forks, merges, and, most importantly, the repeated regression testing after the merges.
It is therefore my opinion that velocity is more important than the existence of a separate hot fix path, again because there is always the release a given customer was running to print successfully. Therefore, the only pressure to put oneself in a more precarious position, where hotfixes might be needed separately, is desire for a feature... and that feature will come more quickly (and arguably more stably) if David's time is not used to manage fork/merge/test.
-
And, like all great postings on forums where people have technical passion, multiple conversations here.
RE: A "test harness".
I just read through the release logs for about 15 minutes, looking at the bug fixes. It seems that very few would be caught by a "send this command, validate these outputs" style harness. The issue is in defining "expected output". Random examples, from recent release notes:
- Fixed potential buffer overflow issues in 12864 menu code
No amount of "external harness" would catch this.
- The scheduled move count was too high by 1 after an emergency pause
No amount of "external harness" would catch this.
- The print progress calculated from filament used was incorrect when using a mixing tool if the sum of mix values was not 1 (e.g. when ditto printing)
Unlikely that the "test case" expected output would be coded "better" than the code itself.
- On SCARA printers, a G30 command immediately after homing the proximal and distal arms could fail due to rounding errors
G30 moves Z until it hits something. i.e. a probe. A value is then stored internally. A test harness would see it probe and stop... and could never know if the internal state stored is correct or not.
- Heaters were turning on momentarily when the Duet was reset
A proper "test harness" MIGHT catch this, if the work was done to parse out all outputs vs. time in a "reset" test case.
The above random sample is consistent with the rest of my reading. A "test harness" would represent a HUGE investment of time/effort, and would factually not catch the vast majority of the things that are receiving bug fixes. In fact, I sort of had to "cherry-pick" to find even one (the last one) that it could even theoretically catch.
It seems much better to test via printing (or machining, or whatever).
-
@danal I never said my test cases were complete; in fact, I explicitly scoped them to just the knowns as a starting point and an example of how we could get coverage for an extreme number of things in minimal time.
From there, we would have a good base that regressions could be added to.
I'm also curious that nobody has mentioned the scenario where Duet adoption grows and how to prepare for that.
-
Duet adoption grows in what regard? Volume of sales? That would result in an increase in resources (revenue) which would permit hiring additional software devs. Naturally, I'm sure some of those devs' responsibilities would involve testing and/or maintaining a separate branch like you describe.