The Cost of Quality

TL;DR It all comes down to economics. What is the cheapest possible way for a company to find defects in its application? For some, it is to let the users find the problems. For others, it is extensive in-house testing before each release. Apples and oranges.

Recently I happened upon a book called “The Leprechauns of Software Engineering”. The premise is that, in software, there are things that we take for granted and believe in blindly. For some of those assumptions there are no (reliable) studies or scientific data to back them up.

You can also listen to an episode of the Ruby Rogues podcast featuring the book’s author.

One of those assumptions is that it is more expensive to fix a defect the later it’s found in the development cycle (planning, development, testing, release to customer). This is known as Boehm’s curve.

The author goes to great lengths to dispel this “myth”. He investigates a number of papers on the subject, starting with the original one by Boehm, and finds none of them “plausible” enough. Basically, he dislikes the type of participants in the surveys (students vs. regular developers), the size of the projects (small vs. large), and data incorrectly cited from other sources. To me this looks like a stretch: a bit sensational, a bit tabloid-like.

Maybe Boehm’s original data is wrong. Maybe the graph is not exactly exponential. Maybe it varies from industry to industry. But this does not change the fact that it’s definitely more expensive to fix a bug found by your customers than to fix the same bug before it reaches them. Yes, there may not be comprehensive studies about this, but it is painfully obvious to everyone who has more than a couple of years of development experience.

I didn’t want to write a post about something as clear as water being wet. We know this and feel it every day. But I want inexperienced and young developers to get the right impression when reading this book. They should not produce crap. They should do what they can to produce quality software the first time.

Some Theory

“It is always cheaper to do the job right the first time.”

Phil Crosby

“One gets a good rating for fighting a fire. The result is visible; can be quantified. If you do it right the first time, you are invisible. You satisfied the requirements. That is your job. Mess it up, and correct it later, you become a hero.”

W. Edwards Deming

Moving to the software world now. There are different industries and companies. The cost of a late defect depends on a number of factors:

  • Is the product free or paid? As a customer, you don’t pay money to use Facebook. You pay in other ways: your time, your data, your actions. If there is a problem with Facebook, who are you going to call? It’s free after all. Facebook boasts that they don’t have a testing role (and it shows), but for them it does not make economic sense to have one. They have so many users and partners who report bugs every day. On the other hand, think of a paid service like PayPal: if there is a defect, money is directly lost (more on working in the financial industry later). So, all things being equal, free services do not require high quality out of the door.

  • Is the company a startup or an established one? In the case of a startup, you’re trying to prove your idea first. Bugs don’t matter much as long as you find the right product/market fit. As long as you have a runway you can pivot to other solutions. You’ll worry about bugs later. If customers love your idea, they will not reject your app because of a few bugs. On the other hand, if you’re an established company, you have a reputation to keep and can’t afford many defects. Customers will run to your competitors.

  • How often can you deploy your product? If your product is SaaS you can deploy whenever you like, even to fix a single typo. If you release once a year (say you’re developing software that is distributed on CDs), then the price of mistakes is high. You may not get a chance to release a fix until the next year. Mobile apps also fall into this category. You can release whenever you like, but currently there is no way to force your customers to upgrade. You’re stuck with supporting the older versions. Software written for the internet of things is even more problematic. How often do you update the firmware of your Smart TV? To sum it up, bugs are cheaper to fix if you are able to deploy frequently.

  • Are you the only game in town? Are you a monopoly? You are a monopoly if you are: Facebook, NASA, specialized software used only in house, The Guardian, etc. What do the customers of Facebook do if there is a bug? Use MySpace? No, they just sit and wait for a fix.

  • Can your software directly endanger human lives? The price to pay for a defect in X-ray software is way, way higher than, say, for a Twitter application. The same is true for avionics software, medical equipment, and nuclear power plants.

As you can see, there is no way that the cost of a defect is equal in all types of software. But one thing is certain: the price is always higher when fixing late, never lower.

The True Cost

Here is what happens when a bug is found by one of your clients. He picks up the phone or writes an email to your customer-facing department to complain. This department logs a ticket for the appropriate team. In some cases the support ticket passes through multiple layers: the first line of defense, technical level 1, technical level 2. The defect description ends up in the backlog of the development team. Usually the product owner decides when the bug will be fixed and schedules it. Then a developer picks up the task and fixes the bug. If there is a designated testing role, the fix is tested and the bug is marked as fixed. Now, using the same chain, the customer is notified (most of the time). All of those people are being paid. If the quality had been right the first time, you would not need that big a customer service department, or that many developers or testers. Or you could keep the same number of people, but have them do value-adding work. If only the quality had been higher the first time.

The cost of the bad quality in the above case can be calculated pretty easily: ‘the number of people involved’ x ‘how much time they are involved’ x ‘the hourly rate’. For more details on how much bad quality costs in the long run you can read Phil Crosby’s book “Quality is Free”.
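As a rough illustration, the formula above can be sketched in a few lines of Python. The roles follow the support chain described earlier, but every hour and hourly rate below is a made-up figure used only for illustration, and the names `Involvement` and `defect_cost` are hypothetical. Since people in the chain rarely spend the same time or cost the same, the sketch sums hours x rate per person, which reduces to the simple product when everyone is equal.

```python
from dataclasses import dataclass


@dataclass
class Involvement:
    role: str           # who touched the bug report
    hours: float        # time spent on it (hypothetical figure)
    hourly_rate: float  # cost per hour in EUR (hypothetical figure)


def defect_cost(chain):
    """Sum hours x hourly rate over everyone involved in handling the defect."""
    return sum(p.hours * p.hourly_rate for p in chain)


# Hypothetical support chain for a single customer-reported bug.
chain = [
    Involvement("first-line support", 0.5, 30.0),
    Involvement("technical level 1", 1.0, 40.0),
    Involvement("technical level 2", 1.0, 50.0),
    Involvement("product owner", 0.5, 60.0),
    Involvement("developer", 3.0, 60.0),
    Involvement("tester", 1.0, 50.0),
]

total = defect_cost(chain)
print(f"Cost of one escaped defect: {total:.2f} EUR")
```

Multiply the result by the number of customer-reported defects per month and you get a first estimate of what the support chain costs. It still leaves out the hidden costs.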

But there is one cost that is hidden and not easy to calculate: your reputation loss. If you have a defect, the customer will complain, which means you’re wasting not only your time but also his. Besides complaining to you, he will say bad things about your product to his friends, coworkers, family, and social network. In the end they will not buy from you. Missed sales opportunities: how can you know how much they are worth? How can you measure them? The only way to make sure this does not happen is to get the quality right from the beginning.

If a defect is found while still in development, it can be fixed faster. Why? Because the problem is still fresh in the developer’s mind. Once you release the product to your customers, the developers move on to other features. Two weeks later, when a defect report comes in, the developer (if he is the same one) needs to shift his focus back from the current task. He needs to recall the problem domain, or learn the functionality from scratch (if he is not the developer who originally created it).

Most of the time, it’s also hard to debug and test in a production environment. The audit logs may be inadequate, or you may not be able to debug due to performance and security concerns.

The cost of the actual fix is the same: in the case of the ‘goto fail;’ bug, you delete one line and commit. But the consequences are quite different depending on the point in time at which you fix the bug.

Examples

Still not persuaded that it costs more to fix a bug once the product is released? Here are some real life examples. Of course, across all the software written in the world, not every bug causes this much trouble, but here are a few outliers just to make a point.

Oh, but you’ll say, I’m not developing that kind of software. That’s fine, and here is one example from my career (there are others, but this may be the most costly).

A while ago, I was working for a finance company. Every defect cost us money. Literally. 400 EUR here, 1200 EUR there. A slightly off FX rate, a small rounding error. This was the cost of doing business and it was considered normal. One day, however, due to a bug, one of our customers woke up and found 100,000 EUR more in his account. He immediately started to withdraw the money from our system. We figured out what had happened and managed to stop 20,000 EUR from being withdrawn, but for the rest it was too late. This bug cost us 80,000 EUR.

Conclusion

There is no need for comprehensive studies on the exact cost of defects found late in the development cycle. We, as practitioners, know that the price is higher. We know it because we see it every day.