The Problem with Quality Metrics

CA Technologies came out with a report, “The Impact of Agile. Quantified,” which looks at four dimensions of performance: responsiveness, quality, productivity and predictability.

The report compares these performance metrics in different situations. For example, when you have dedicated team members, the numbers show that productivity, measured as throughput, doubles.

The report notes, however, that correlation does not necessarily mean causation:

For example, just because we show that teams with low average WiP have one-quarter as many defects as teams with high WiP, doesn’t necessarily mean that if you lower your WiP, you’ll reduce your defect density to one-quarter of what it is now. The effect may be partially or wholly related to some other underlying mechanism.

The study included comparisons of performance metrics based on how teams performed estimates. The four estimating processes compared were:

  1. No estimates
  2. “Full Scrum” with tasks estimated in hours
  3. “Lightweight Scrum” with estimates using story points only
  4. Only task hours estimated

The key findings were:

  • Teams doing full Scrum have 250 percent better quality than teams doing no estimating.
  • Lightweight Scrum performs better overall, with better productivity, predictability and responsiveness.

Personally, despite the “key findings,” I don’t think a team’s estimating process is the cause of the differences in quality. In fact, I don’t trust any of the numbers having to do with quality because I don’t agree with the way quality is being measured.

My issue with the report is the same issue I have with many other quality reports: Quality is measured by number of defects alone, when in reality, those numbers can be very misleading.

Having spent many years as a QA Manager, I know that defect counts are very commonly used in quality measurements. As I wrote in this article, I believe they are a very poor indication of quality, especially when you’re comparing across different projects.

Here are some of the problems with using defect count as a measure of quality:

  1. Defect count does not take into account the severity of the defect.
  2. Defect count does not take into account the definition of a defect, which may vary greatly on different products.
  3. Defect count does not take into account the way the defects are being collected.
  4. And perhaps, most importantly, defect count does not take into account the number of users using the product.

This last point was made clear to me when I was working on a team that was given an award for “Best Quality” because there were zero reported defects. The application had been released internally, though I hadn’t felt it was ready because testers were still finding bugs easily.

While I was happy to get an award, it didn’t make sense to me that this application could win a quality award. On further investigation, I found out that no customer defects had been reported because not a single person was using the new application.

On further reflection, I realized that the applications that typically had the most reported bugs were those that were used the most. These were not necessarily of poorer quality than other applications; they simply had far more users, and those users cared enough about the application to take the time to report the bugs.
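To make the point concrete, here is a minimal Python sketch of one way a raw defect count could be adjusted for severity and usage. It is only an illustration: the severity weights, the `AppStats` structure, the function names and the sample figures are all hypothetical assumptions, not anything taken from the report or from a real project.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical severity weights, chosen only for illustration.
SEVERITY_WEIGHTS = {"critical": 10, "major": 5, "minor": 1}


@dataclass
class AppStats:
    name: str
    active_users: int
    defects_by_severity: dict[str, int]  # severity label -> reported defects


def raw_defect_count(app: AppStats) -> int:
    """The metric criticized above: total reported defects, nothing else."""
    return sum(app.defects_by_severity.values())


def weighted_defects_per_1000_users(app: AppStats) -> Optional[float]:
    """Severity-weighted defects per 1,000 active users.

    Returns None when nobody uses the application: with no users,
    an absence of reports says nothing about quality.
    """
    if app.active_users == 0:
        return None
    weighted = sum(
        SEVERITY_WEIGHTS[severity] * count
        for severity, count in app.defects_by_severity.items()
    )
    return 1000 * weighted / app.active_users


# Made-up sample data: an unused app and a heavily used one.
unused = AppStats("unused", active_users=0, defects_by_severity={})
popular = AppStats(
    "popular",
    active_users=20_000,
    defects_by_severity={"critical": 1, "major": 4, "minor": 35},
)

for app in (unused, popular):
    print(app.name, raw_defect_count(app), weighted_defects_per_1000_users(app))
```

By raw count, the unused application looks perfect with zero defects, and the popular one looks worst with 40. The adjusted figure refuses to score the unused application at all, which is closer to the truth. Even so, a number like this still depends on how defects are defined and collected, so it should be treated as a prompt for questions rather than as a verdict.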

There are no easy answers when it comes to measuring quality, or many of the other performance factors, for that matter. Teams that use “full Scrum” may, in fact, generally produce higher quality code than those that don’t estimate.

However, if you want to improve your quality, I think your team would do best to focus on getting feedback from your customers. Make it easy for them to give you that feedback.

We want to eliminate defects, sure. However, let’s remember that quality is about more than defect counts. We need to hear from our customers, and whether you call that feedback “requests for enhancements,” “defects” or “customer comments,” consider that feedback a positive thing that will help us continue to improve.


Yvette Francino has more than 30 years in the software development industry, and is an independent consultant, experienced agile leader, coach, author and trainer in various methodologies including SAFe, Scrum, Kanban and large-scale custom methodologies.
