Why Doesn’t Stuff Just Work As It Should?

Steve Fox of PCWorld raised some timely questions in his March 2010 column, “Tech Products: Revolting, Not Rebelling”:

… our state-of-the-art technology too often fails to work as it should. That’s why I have to reboot my Wi-Fi router at least once a week; why my fingerprint-recognition pad periodically forgets what my thumb looks like; and why my smartphone keeps dropping calls without provocation.

Mostly, I think the answer lies in our neglected software development process. In darker moments, many of us probably suspect that our software vendors hire besotted programmers to code their operating systems and mission-critical software in bars and back alleys. In truth, a coding project like a modern Mac OS X or Windows 7 may rival the Manhattan Project in resources and organizational complexity. So when things go south, where do we go wrong?

Funny, I read the Steve Fox article while rewriting a coding project of my own. I’m re-engineering a web app I had working perfectly about 10 years ago. Over the years, I added features which didn’t always play nicely with the existing code. With patches and add-ons, tracing the code execution became an exercise in frustration.

I also discovered that code I “locked down” years ago no longer works as originally designed and tested. In recent re-testing, I found features I added in 2006 that looked as if they could never have worked. Yet they did work at the time. Software environments change over the years, and the code must change with them.

After 17 years in the software QA industry, I recognized this kind of code immediately in my own project. We called it “spaghetti code.” When you’re committed to preserving as much of the old code as possible, you’re wedded to old designs that may not be appropriate for the enhanced project. Why hadn’t I designed a more modular web app?

Well, I know more now. I have years of professional experience testing software, but as an amateur programmer, I’m still learning basic techniques and standards that were part and parcel of industry practice decades ago. Standards aren’t the problem for the major developers in the world that revolves around Silicon Valley.

So, what exactly IS the problem? Marketing deadlines and budget constraints certainly are major bugaboos for developers. Established standards are ignored or marginalized. In truth, every code change is a calculated exercise in risk-taking. “It should work” isn’t acceptable in aeronautical engineering. It shouldn’t be in software development either.

As I fix hidden defects in my own program, it seems like every fix opens the door for two new bugs. That tells me about where I am in the “development cycle” – not quite over the hill yet!

QA Development Cycle

My scratch-pad doodle shows the life cycle of a software product, in terms of bugs found and fixed, on its path from the programmer’s desk to the client or end user. This discussion talks in terms of a software enhancement. When we think about it, the same principles apply to routers, smartphones and even microwave ovens, autos and airplanes.

  • If you recognize this graph as the classic “Bell Curve”, go to the head of the class.
  • Vertical axis: bug severity. Major bugs are “showstoppers”; minor bugs are annoying but don’t render the product unusable. (A minimal sketch of these severity levels in code follows this list.)
  • “Showstoppers” render the entire application unusable (such as a crash), or leave a major feature inoperable or producing a wrong result. There is no workaround for a showstopper, hence its highest severity.
  • Cosmetic bugs are appearance issues. They reflect badly on the product and annoy the user. Lower severity.
  • Functional bugs with lower severity have a workaround, or we can get past them without incorrect data or data corruption.
  • Horizontal axis: point in time in the development cycle.
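
To make those severity rules concrete, here’s a minimal sketch in Python of how a bug tracker might encode them. The names and triage rules are my own illustration, not anything from a real tool:

```python
from enum import IntEnum

class Severity(IntEnum):
    """Higher value = more severe (the vertical axis of the curve)."""
    COSMETIC = 1      # appearance issues only
    FUNCTIONAL = 2    # wrong behavior, but a workaround exists
    SHOWSTOPPER = 3   # crash, inoperable major feature, or wrong results; no workaround

def triage(crashes: bool, wrong_result: bool, has_workaround: bool,
           cosmetic_only: bool) -> Severity:
    """Assign a severity using the rules described in the list above."""
    if cosmetic_only:
        return Severity.COSMETIC
    if (crashes or wrong_result) and not has_workaround:
        return Severity.SHOWSTOPPER
    return Severity.FUNCTIONAL

# Example: a crash with no workaround is a showstopper.
assert triage(crashes=True, wrong_result=False,
              has_workaround=False, cosmetic_only=False) is Severity.SHOWSTOPPER
```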

Unit testing is mostly done by the programmer before the QA, QC or testing department ever sees the code. Smaller modules of code are tested in a static setup against a spec sheet or checklist. Once the expected results can be produced for all planned test conditions, the module is ready for “system testing”.
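
As an illustration, here’s what that kind of unit test might look like in Python. The loan-payment module and its spec-sheet values are hypothetical, invented for this example:

```python
import unittest

def monthly_payment(principal: float, annual_rate: float, months: int) -> float:
    """Hypothetical module under test: fixed-rate loan payment."""
    r = annual_rate / 12
    if r == 0:
        return principal / months
    return principal * r / (1 - (1 + r) ** -months)

class TestMonthlyPayment(unittest.TestCase):
    """Each test condition comes straight off the spec sheet."""
    def test_zero_interest(self):
        self.assertAlmostEqual(monthly_payment(1200, 0.0, 12), 100.0)

    def test_typical_loan(self):
        # Spec: $10,000 at 6% for 60 months => about $193.33/month
        self.assertAlmostEqual(monthly_payment(10_000, 0.06, 60), 193.33, places=2)

if __name__ == "__main__":
    unittest.main()
```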

System testing requires that the module talk to the existing application as a whole. It should produce the expected results of the enhancement without breaking anything. This is where the bulk of testing should take place.
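
Here’s the same idea in miniature, as a toy Python sketch: a made-up “discount” enhancement is exercised through the application as a whole, and the test checks that a legacy report feature still adds up:

```python
import unittest

# Hypothetical application pieces: an existing report feature and a new
# "discount" enhancement that must not break it.
class Orders:
    def __init__(self):
        self._orders = []

    def add(self, amount: float, discount: float = 0.0):
        self._orders.append(amount * (1 - discount))   # the new enhancement

    def total_report(self) -> float:                   # the existing feature
        return round(sum(self._orders), 2)

class SystemTest(unittest.TestCase):
    def test_enhancement_plays_nicely_with_existing_report(self):
        app = Orders()
        app.add(100.00)                  # old behavior, unchanged
        app.add(100.00, discount=0.10)   # new behavior
        # The enhancement works AND the legacy report still adds up.
        self.assertEqual(app.total_report(), 190.00)

if __name__ == "__main__":
    unittest.main()
```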

Beta and Acceptance testing is conducted by designated users in the client community to see if the product is ready for upgrade or replacement of an existing “production” product.

Understanding The Curve

You don’t need Statistics 101 to follow most of what the curve is telling us.

Programmer testing only suggests that the enhancement is ready for system testing. Getting early code to run at all is just part of the programmer’s job and doesn’t count in the bell curve. Just as pilots have to have a fundamental conviction that airplanes will fly, programmers are constitutionally constructed to believe the design is sound and can be made to produce the expected results.

Software testers are constitutionally built to believe that every product has serious latent flaws, and that they are capable of setting up the test conditions to expose them. Some of those “showstopper” bugs can’t even be re-created in a real-world environment.

The real story is in the shape of the curve. Some shops don’t believe in wasting valuable resource dollars on “excessive” quality control. If fewer severe bugs are found, the curve will have a narrower “hill” and its peak will occur early in system testing, and the likelihood increases that a “showstopper” will make it all the way to the end user’s desk. If the severity curve “flattens”, meaning it does not clearly go down by the midpoint of budgeted system testing (the middle portion of the curve), the project is in trouble.
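
That “flattening” rule of thumb is easy to automate. Here’s a small Python sketch (the weekly counts are invented for illustration) that flags a project whose severe-bug discoveries aren’t clearly declining by the midpoint of budgeted system testing:

```python
def curve_is_flattening(severe_bugs_per_week: list) -> bool:
    """True if severe-bug discoveries are NOT clearly declining by the
    midpoint of budgeted system testing (illustrative heuristic only)."""
    mid = len(severe_bugs_per_week) // 2
    peak = max(severe_bugs_per_week[:mid] or [0])
    # Healthy curve: every week after the midpoint stays below the early peak.
    return any(count >= peak for count in severe_bugs_per_week[mid:])

# Healthy project: severe bugs peak early in system testing, then taper off.
assert not curve_is_flattening([3, 8, 12, 9, 5, 2, 1, 0])
# Troubled project: severe bugs keep arriving at the same rate.
assert curve_is_flattening([4, 5, 6, 5, 6, 5, 6, 5])
```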

A complete project flow-chart “map” looks like a street map of a real estate development project or even a small town. A flow chart of a poorly designed legacy product can look like a map of the New York City sewer system.  A development QA effort that never really finds showstoppers is probably in deeper trouble.

The Sterilized Answer

In analytical terms, the reason we get dumb smartphones, home finance software that won’t install, and operating systems that won’t shut down is that inadequate testing resources shift the development bell curve to the left, where there’s less motivation and training to see the system from the user’s point of view. “It ought to work” is NOT a language spoken in the homes and offices of the end user.

The Bottom-line Answer

Complex systems require complex testing and a clever, flexible and rigorous strategy to ensure the product meets end user expectations and needs. This is true whether the product is a software office suite or a Ford Pinto. It’s never unfair to call a slipshod product “slipshod”.

Budget cuts, layoffs and outsourcing can kill a product. A company that cannot afford good training and documentation is particularly unlikely to be able to supply replacement offshore development teams with the resources they need to assimilate the legacy “knowledge base.”

When complex systems fail, all too often the defect was cast-in-stone way back in the design stage, and it was never detected (or it was, but was ignored) due to rushed development. Inadequate testing can only be explained by budget shortfalls and unrealistic release deadlines.

I’m guessing at the numbers, but the millions of lines of code in a modern operating system should be approximately comparable to all the software NASA used to put men on the moon in 1969. And even NASA-grade software has been known to fail for non-hardware reasons. When we think about it, it’s nothing short of a miracle our software works as well as it does.
