Measuring the Progress of Agile

The eleventh State of Agile survey has just been published by VersionOne. These reports are invaluable in helping agile practitioners understand where their practices, problems and challenges fit the context of the wider world. 

As with all VersionOne’s previous reports, the eleventh survey paints a picture of the onward march of agile. Its progress is rarely in a straight line, however, and this latest survey has revealed a very interesting contradiction.

Increased Focus on Business Value?

When respondents were asked how the success of agile initiatives was measured, the second-ranked answer after “on-time delivery” was “business value.” In the previous report, “business value” was the fourth-ranked answer. This time, some 46 percent of respondents chose it, ahead of “customer/user satisfaction” (44 percent), “product quality” (42 percent) and “product scope” (40 percent). No other answer was given by more than a quarter of respondents.

When, in the same survey, participants were asked how the success of agile projects (as opposed to initiatives) is being measured, the percentage who answered “business value” fell by half. The implication here would seem to be that the day-to-day metrics collected about the work of agile teams were not primarily focused on business value. In fact, there were eleven other measurements that scored higher than business value. Those measurements were (in ranking order):

Velocity (67 percent)
Sprint burndown (51 percent)
Release burndown (38 percent)
Planned vs. actual stories per iteration (37 percent)
Burn-up chart (34 percent)
Work in Progress (32 percent)
Defects into production (30 percent)
Customer/user satisfaction (28 percent)
Planned vs. actual release dates (26 percent)
Cycle time (23 percent)
Defects over time (23 percent)

Old Muscle Memory

The list of day-to-day metrics being collected shows us the grip that the old muscle memory of waterfall still has on the agile community. The mantra of traditional project management is “Plan the Work, Work the Plan.” It assumes predictability and, consequently, the metrics that are believed to be important are those which show whether there is deviation from the masterplan. Any discrepancies are considered likely to be due to a lack of efficiency.

Velocity, for example, which is right at the top of the list above, tells us nothing about progress or success. Rather, it is a metric which is useful to the development team because it allows them to judge how much work they can pull into a sprint.

Nobody else -- with the possible exception of the product owner, who can use target velocity to estimate release dates -- should be interested in velocity. When managers try to drive up a team’s velocity, it almost always causes the defect rate to spike and the delivery of value to the customer to slow down.
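To make the product owner's use of velocity concrete, here is a minimal sketch of the arithmetic involved. The function name and the figures are hypothetical, and it assumes the remaining backlog has already been sized in story points; it produces a forecast, not a promise.

```python
import math

def sprints_to_release(remaining_points, velocity_per_sprint):
    """Rough number of sprints needed to work through the remaining backlog."""
    if velocity_per_sprint <= 0:
        raise ValueError("velocity must be positive")
    # Round up: a partly used sprint still has to be run.
    return math.ceil(remaining_points / velocity_per_sprint)

# Hypothetical figures: 120 points left, about 30 points finished per sprint.
print(sprints_to_release(120, 30))  # prints 4
```

Multiplying the result by the sprint length gives a rough release date, which is all the product owner needs from the number.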

The next four measurements in the list, and that of “planned vs. actual release dates” which comes further down, are all about whether the team is working to plan. Again, these can be very useful to the team itself so that it can decide whether its own plan needs adjustment to achieve a sprint goal or a release goal. Used by anyone else, they just offer opportunities for micromanagement. 

“Work in progress” and “cycle time” are useful for measuring the smoothness (or lack thereof) of the development and delivery pipeline, while the “defects into production” and “defects over time” can tell us something about the quality of the product.

Progress is in the Product

In short, these can all be useful measurements, but apart from measuring customer/user satisfaction, they are at best secondary when it comes to measuring progress. The value -- and therefore the success of the project -- is in the product and nowhere else. The most crucial factors to measure are the product’s delivery and its impact on the world.

If management, stakeholders or anyone else outside the agile team wants to know the progress being made then all they need do is show up to the sprint review where (in Scrum, at least) they will get to see the latest increment and can suggest what might be done next. Everything else should be left to the team itself.

Please feel free to use the comments section below to tell me if you are surprised (or unsurprised) by the survey’s findings or if there are any additional key factors you use to measure your team’s progress.


Are We Done Yet?

The definition of done (DoD) is one of the most important and least-understood elements of the Scrum Framework. It is specifically called out in “The Scrum Guide” in what is probably its biggest section, and yet, I’ve seen so-called ‘definitions’ of Scrum that fail to mention it at all.

In this post, we’ll be talking about why, exactly, the DoD is so important. 

DoD Explained

So, what is the definition of done? Fundamentally, it is the Scrum team’s agreement about the standard of quality that it will apply across the product. This concept is closely related to that of the Potentially Shippable Increment that must be created at the end of each and every sprint. The two words in that phrase that the DoD concerns are “potentially” and “increment.”

While all agile approaches – Scrum included – aspire to “deliver early and deliver often,” this does not mean that a product must be handed over to the customer at the end of every sprint. Whether enough useful value has been accumulated to warrant a product’s release is a business decision, and one that is the product owner’s responsibility to make.

If, however, the product is not of releasable quality, then the product owner is effectively relieved of that responsibility. So, Scrum requires that the latest increment -- whether it is going to be released or not -- is of sufficient quality that it could be handed over.

What the product owner and the development team are agreeing on when they establish the DoD is the quality bar that will determine what can be shown in the sprint review. You may have heard phrases like “done, done” and even “done, done, done” from some agile practitioners, but in the world of Scrum there is only “done” and “not done.”  

An increment – and that’s the fundamental level at which the DoD works – is done only if it meets the definition of done and can be demonstrated in the sprint review. If it doesn’t meet that standard, then it cannot be shown to the stakeholders.

At the very least, the DoD should mean that the increment has passed all its tests and is fully integrated with the previous increments. In this way, what is being shown is a quality-assured, small and skinny version of the product.


Clearly, the DoD has implications for governance as well as quality. “The Scrum Guide” says that if there are corporate standards in place then they form the default DoD. My interpretation of this is that if, say, there is a company standard for code quality, then that standard should be incorporated into the development team’s practices as well.

Agilists never trade quality for speed, and so we should never lower that standard. In some cases, however, existing corporate standards might get in the way of efficient development. Organizations that use PRINCE2, for example, will often have a phase gate-based governance approach.

Such an approach might work well in a product development that has a high level of predictability, but where there are many unknowns (as in most software development) an approach that is instead based on feedback and responsiveness is needed.

Because of its reliance on predictability, phase gate governance can kill an agile product development. So, in organisations that use phase gate governance, the Scrum team will need to have a conversation with the wider organisation to find better ways of giving stakeholders confidence.


A DoD which clarifies what is needed for the sprint review and which is enacted by the Scrum team accordingly is the perfect starting point for feedback-based governance. Since there is no one-size-fits-all DoD, what the DoD includes is always going to be situational. But, if we accept that the increment must be fully tested and integrated, then several practices naturally suggest themselves.

First, each PBI which makes up the increment will need to have passed both its acceptance and unit tests. Acceptance tests are specific to each PBI, of course, but the DoD will presumably state the policy that all items must pass their acceptance tests to be accepted.

The entire increment will also need to pass integration testing and, as it is unlikely that all the PBIs will be finished at the same time, each of those will need to be integrated incrementally. Therefore, regression and integration testing are strongly suggested at the PBI level.

In other words, there are implications about workflow embedded in the DoD.


There is yet another aspect to the importance of DoD: the team.

A group of individuals is just that, and can only become a genuine team when it rallies around a common goal. But how can a team meet a goal if they don’t know when their work is finished? 

I once interviewed two programmers on the same team and asked them how they dealt with quality. One of them opened up his IDE and showed me the unit and acceptance tests he ran on his code. The other told me it was the QA department’s job to deal with quality – not his. As you can imagine, this product (and the group building it) did not fare well in the end.

Those were two people with the same functional background. Imagine having a new development team, with all the different skill sets needed to create the product, and no clear DoD. The result certainly isn’t pretty.

Revisiting the DoD

Scrum teams, for various reasons, may not be able to take product increments to a potentially releasable level every sprint. This might be due to the team’s level of performance (if they are a new team, for example), or because the production environment could not be fully replicated in the team’s development environment. 

In this situation, it is important that the DoD clearly indicates where the team’s responsibility ends. Any work that would still be needed to make an increment releasable would be categorized as “undone work” and would need to be listed in the DoD. The DoD would then be retrospected regularly to see what could be moved from the “undone” category into the “done” category as the team takes quality assurance more and more into its own routines.

“Undone” work should not be confused with unfinished work. Work required by the DoD which is unfinished means the item concerned is not “done” and thus cannot go into the review. An item can still be “done” if there is “undone” work.

To understand this, we can think of there being two quality bars: one for the sprint review and one for actual release. The gap between them is the undone work. In committing to any sprint goal, the team is implicitly committing to the DoD as well.

However, it is not committing to do the work listed as undone in every sprint -- that work will be done just prior to an actual release. The development team’s job is simply to get the increment to “done,” and the product owner then decides whether the PBIs that are part of it can be accepted as complete.
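The idea of two quality bars can be sketched in a few lines of code. This is only an illustration: the check names are hypothetical, and a real DoD will list whatever the team has actually agreed.

```python
# Hypothetical quality bars, expressed as sets of checks.
DEFINITION_OF_DONE = {"unit tests pass", "acceptance tests pass", "integrated"}
RELEASE_BAR = DEFINITION_OF_DONE | {"tested in production-like environment"}

# "Undone" work is the gap between the two bars.
UNDONE_WORK = RELEASE_BAR - DEFINITION_OF_DONE

def is_done(completed_checks):
    """An item is done only if every check in the DoD has been completed."""
    return DEFINITION_OF_DONE <= set(completed_checks)

def is_releasable(completed_checks):
    """Releasable means the undone work has been completed as well."""
    return RELEASE_BAR <= set(completed_checks)

item = {"unit tests pass", "acceptance tests pass", "integrated"}
print(is_done(item), is_releasable(item))  # prints True False
print(sorted(UNDONE_WORK))
```

An item with an unfinished DoD check fails `is_done` and stays out of the review, while an item that passes `is_done` but not `is_releasable` is still "done" and simply has undone work outstanding.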

So what happens if an item shown in the sprint review is rejected as not fit for purpose by the customer? The answer is that, since it passed the Scrum team’s standards, it remains “done.” If the product owner believes the requirement still has value, he or she will put it into the product backlog as a new item and it will be prioritized accordingly. But, of course, the team should still reflect on what has happened, and may well strengthen the DoD to reduce the chances of “done” items being rejected in the future.


So far, we’ve seen that the definition of done is important for: 

Product-wide quality standards
Governance
Workflow and engineering practices
Team-building

We’ve also seen that Scrum teams need to revisit their current definition of done on a regular basis to strengthen their assurance of the product’s quality.

If you can think of any other things that the DoD is important for, feel free to let me know in the comments section below. And with that, we’re done.

How to Get Your Teams to Estimate Better

“We need to get better at estimating,” an experienced member of a Scrum development team once told me. “Management is getting concerned that we keep coming up short on our commitment.”

“Really?” I responded. “What have you been committing to?”

“Thirty story points,” she said. “We get there about 50% of the time. In a couple of recent sprints, we’ve even exceeded thirty points, but the last sprint marked the third time this quarter that we fell short of our target.”

“Why was your forecast off target, do you think?” I asked.

“Well, things come out of left field occasionally. You know, stuff that couldn’t be anticipated in sprint planning,” she answered.

“So, why are you estimating effort if you can’t predict what will happen in the sprint?” I said.

Now, at this point I want to make clear that I am not one of those who say that development teams should not estimate effort. For me, the ability to estimate independently is an important part of the autonomy of teams. What I was trying to do in this particular conversation was to get the team member to consider why teams estimate in the first place, and what “commitment” means in that context.

It is not just a matter of those doing the work being the only ones who are able to estimate effort, although I believe that to be true. I recently participated in a large multi-team project in which effort was estimated by a centralized systems analysis unit. The development teams were involved in estimating, but it was the systems analysts’ forecast that was communicated to the customer. The result was massively inflated customer expectations, and retrospection revealed that the teams’ estimates were about four to five times larger (and thus far more accurate) than those of the analysts.

Why Estimate Effort?

So, development teams make the best and most accurate estimations. That’s all well and good, but it still doesn’t answer the question of why estimates are necessary in the first place.

Scrum is a “pull” system in that it balances demand with the team’s capacity to do work by giving the team control over how much work is brought into the sprint. No one can tell a development team how to do its work, or how much work it must do.  Effort is estimated – in story points, in the number of stories or in person-hours – so that a judgement can be made about how much work can be pulled in from the product backlog in order to meet the sprint goal.
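As a rough illustration of what "pulling" means here, the sketch below walks down an ordered backlog and takes items while they fit the team's own judgement of its capacity. The function name, the backlog items and the capacity figure are all hypothetical, and a real team would weigh the sprint goal as well as the raw numbers.

```python
def forecast_sprint(ordered_backlog, capacity_points):
    """Take items from the top of the ordered backlog while they fit the capacity."""
    pulled, used = [], 0
    for name, points in ordered_backlog:
        if used + points > capacity_points:
            break  # stop at the first item that does not fit, keeping backlog order
        pulled.append(name)
        used += points
    return pulled, used

# Hypothetical backlog, already ordered by the product owner.
backlog = [("login", 8), ("search", 13), ("export", 5), ("audit log", 8)]
print(forecast_sprint(backlog, 30))  # prints (['login', 'search', 'export'], 26)
```

Note that the capacity figure is an input supplied by the team, not a target handed to it.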

When either team members or their managers start to fret about the team’s commitment in terms of its target velocity, it is a sure sign of old muscle memory kicking in. After all, the velocity is a forecast, not a commitment. The team’s commitment is not to its velocity but to the goal that was agreed upon in the sprint planning. And, in any agile environment, commitment means that, given the information we have right now, the resources we have at our disposal and our judgement about our capacity to do work, we think we can get to a certain point by the end of the sprint, and we are going to do everything in our power to get there.

Therefore, a commitment can never be equated to a guarantee. It is akin to the moment a football team walks onto the pitch: the team is committed to winning the game, but there are too many variables in play (not the least of which is the opposing team!) for victory to be guaranteed from the start.

When managers ask me questions about the target velocity of the teams, my first response is to ask them why it should be any concern of theirs. They typically mumble a reply about making sure the team is working to plan, or being efficient. A brief chat normally follows in which I point out that management’s concern should be with business outcomes and with making sure the teams have the environments and tools they need to deliver them, not with the team’s tasks.

Target Velocity and the Sprint Goal

So, at this point we’ve established that target velocity is a concern of the development team and no one else (OK, the product owner needs to know about velocities so he or she can make trade-offs in the ordering of the product backlog, but that’s another story entirely). Estimates simply need to be accurate enough for the team to confidently identify the items at the top of the product backlog that can be worked on to meet the sprint goal.

A good product owner will give the developers extra wriggle room by articulating the sprint goal to customers and stakeholders independently of the product backlog items (PBIs) or the stories that might constitute it. By giving the upcoming increment a named state, or by describing the expected value in a few sentences, the product owner can work with the team if a mid-sprint descoping of the forecast is needed, without compromising the sprint goal.

All of this means that while the team’s estimates do not demand the same precision as task estimation in waterfall-based planning, they do still need to be accurate. Actual velocities that go up and down like a roller coaster help no one, and some level of predictability is needed. I’ve seen teams agonize over what a 13-point story should look like, and have even been shown handbooks in which reference PBIs are described for each number in the Fibonacci series. In my experience, this type of overly complicated approach never works.

"Smaller" is More Predictable

There is only one way to improve the estimate of effort for PBIs: breaking them down into smaller PBIs. Think for a moment about the values most often used in planning poker, for example: 1, 2, 3, 5, 8, 13, 20, 40 and 100. They loosely follow the Fibonacci sequence, in which each number is the sum of the previous two.

This raises the question, why “20” instead of “21”? Simply, because “21” would be too precise. In effect, we are saying that anything more than 13 can be considered large, and will probably need to be decomposed.

The larger numbers in the sequence and, more specifically, the larger gaps between them, reflect a corresponding level of uncertainty. Suppose the team thinks a PBI is bigger than an 8, but is probably not big enough to be a 13. In that case, it might “precisely” be a 9, 10, 11 or 12. However, they can only categorize the PBI as either an 8 or a 13. Essentially, the larger a PBI is, the greater its possible variation from its predicted effort. On the other hand, if all the stories are, say, a 5 or smaller, then it is only when a PBI sits between a 3 and a 5 that you begin to see significant variance.
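The widening gaps are easy to see if you lay the cards out in code. This is just a sketch using the card values quoted above; the function names are my own, and the "raw effort" figure is a hypothetical number the team would never actually write down.

```python
CARDS = [1, 2, 3, 5, 8, 13, 20, 40, 100]

def neighbouring_cards(raw_effort):
    """Return the cards just below (or equal to) and just above a raw effort figure."""
    if not CARDS[0] <= raw_effort <= CARDS[-1]:
        raise ValueError("effort is outside the range of the cards")
    lower = max(c for c in CARDS if c <= raw_effort)
    upper = min(c for c in CARDS if c >= raw_effort)
    return lower, upper

def card_gaps():
    """Size of the gap between each card and the next one up."""
    return [(a, b, b - a) for a, b in zip(CARDS, CARDS[1:])]

print(neighbouring_cards(10))  # prints (8, 13)
print(neighbouring_cards(4))   # prints (3, 5)
print(card_gaps())
```

A PBI that is "really" a 10 can be off by up to 3 points whichever card is chosen, whereas one that is "really" a 4 can only be off by 1. The gaps returned by `card_gaps` grow from 1 point at the bottom of the scale to 60 points at the top.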

It doesn’t really matter whether the team uses stories and story points or not, as long as the underlying goal is still the same: to break the PBIs down to the smallest size they can be and fit a number of them into the sprint. In my view, this is the only way that a team can reliably improve the accuracy of their estimations.

Is Kanban Always a “Pull” System?

The pushmi-pullyu (pronounced “push me-pull you”) is a fictional creature in “The Story of Doctor Dolittle,” which is described as a cross between a gazelle and a unicorn.

Recently, I was talking with a fellow software professional about some issues he was having implementing Kanban, and the pushmi-pullyu came to mind. Allow me to explain why.

Kanban’s Popularity

Kanban is the most popular agile approach after Scrum, according to the most recently published State of Agile Survey. Scrum is, of course, massively dominant.

Only 5 percent of respondents described their teams as Kanban teams, while three-quarters used either Scrum, a Scrum hybrid or Scrumban. However, when asked what techniques they used, nearly four in every 10 teams said that they use Kanban boards.

This might be an exaggerated figure since, in my experience, many people confuse Kanban boards with Scrum boards. They look similar, but have very different purposes.

While a Kanban board maps a value stream for the lifetime of the product, a Scrum board is a visualization of a sprint backlog, and is reset at the beginning of every new sprint.

Additionally, the Scrum board is owned by a single Scrum development team, while Kanban is agnostic about who owns the board. These differences turned out to be at the heart of the issue that my colleague raised with me.


Kanban means “signal card” in Japanese, but it was by observing the processes at work in U.S. supermarkets in the 1940s that Taiichi Ohno was inspired to include the technique in what later became the Toyota production system.

From there, the Toyota production system gave birth to lean manufacturing, which spawned lean software development.  Kanban in software emerged from this bloodline.

The core idea of Kanban is that no downstream process is sent additional work from upstream unless it has displayed a visual signal that it has the capacity to handle that work. 

Kanban signals are weapons used to eliminate the waste of inventory, such as work items that are queued or waiting to be processed.


Kanban (in software) implements this approach through the use of explicit work in progress (WIP) limits for each value-added stage in the workflow. These WIP limits are shown on Kanban boards, and are their most recognizable characteristic.

If, for example, there is a WIP limit in testing of five tickets, then whoever is responsible for testing should not accept a sixth ticket unless and until the number they are currently working on drops below five.

As a result, anything that is blocked tends to clog the entire pipeline. If enough tickets get blocked, then the entire value stream will grind to a halt. Then, everyone in the process gets involved in unblocking the pipeline and restoring flow. At least, that’s the idea.
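The WIP-limit rule described above can be sketched as a column that refuses new tickets once it is full. The class and ticket names are hypothetical, and a physical board enforces nothing by itself; the point is only to show the rule.

```python
from dataclasses import dataclass, field

@dataclass
class Column:
    name: str
    wip_limit: int
    tickets: list = field(default_factory=list)

    def has_capacity(self):
        """The visual signal: is the column under its WIP limit?"""
        return len(self.tickets) < self.wip_limit

    def pull(self, ticket):
        """Accept a ticket only if the column is under its WIP limit."""
        if not self.has_capacity():
            return False
        self.tickets.append(ticket)
        return True

testing = Column("Testing", wip_limit=5)
results = [testing.pull(f"T{i}") for i in range(1, 7)]
print(results)  # prints [True, True, True, True, True, False]
```

The sixth ticket is only accepted after one of the first five leaves the column, which is exactly why a blocked ticket holds up everything behind it.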

Used properly, a Kanban system is a “pull” system where demand and capacity are balanced using explicit WIP limits. Scrum is also a “pull” system, in which a development team pulls a small batch of work into the sprint based on its estimate of how much work it can complete in the given timebox.

Scrum works because the team is self-organising. No one can tell the team how to do its work, or dictate how much work will be completed in a sprint.

Push vs. Pull

However, Kanban does not demand that a team be self-organising, and each value-adding stage (each column on the Kanban board) might reflect the work of a different team. So, the question is: Who sets the WIP limits? Again, Kanban leaves that open to interpretation.

In my colleague’s case, management set the WIP limits on the board. When the workflow got blocked in one particular area, managers came up with the “solution” of raising its WIP limit.

Of course, this solved nothing. In effect, management just told the team concerned to work harder. Thus, what was intended to be a “pull” system instantly transformed into a “push” system.

Now, this isn’t to say that I’m knocking Kanban. I’ve worked with a number of Scrum teams that use Kanban boards to good effect. But, in every case, it was the teams that set the WIP limits, which removed the temptation for managers to bully the team into doing an inordinate amount of work.

There’s no magic involved in Kanban boards, or in Scrum boards for that matter. Both tools simply serve as a visualisation of the work in progress. It is what you do as a result of what you see on the board that matters.

To me, the idea that you can achieve genuine agility without self-organising teams is just as fictional as Dr. Dolittle’s pushmi-pullyu.






The Product Owner's Role in Team Building

Self-organising, self-managing teams are at the heart of Scrum. Everyone knows that. Most people are clear that, as an agile coach, the Scrum Master bears the main responsibility for growing the team.

Less understood is the idea that everyone in a Scrum team has a role in team building if it is to be successful. Geoff Watts has written about how collaborative behaviour is a skill that has to be acquired individually by members of the development team.

But what about the product owner—what can she contribute?

Let’s segue into the world of team sports for clues to an answer. Jose Mourinho is possibly the most successful club coach currently in world soccer.

He has been a head coach only since 2002, and has spent two of the intervening years on gardening leave, yet his teams have already won an astonishing 24 titles in four different countries: Portugal, England, Italy and Spain. He has coached two sides, Porto and Inter Milan, to be European Champions.

What is it that he knows about team building that we don’t?

Mourinho has famously said that a group of individuals is just that, and they only become a team when they are “seduced by a common objective.” He not only identifies the trophies the team should target for the season and sets objectives for each and every game, but also for every single training session.

His record suggests he’s on to something. But the operative word in the phrase quoted above is “seduced.” In other words, the group has to take ownership of the objectives themselves, and commit to helping one another deliver them.

Moving back to Scrum, it is the product owner who sets targets and identifies objectives. She upholds the vision of the product for the business. She will probably map out release goals on a product roadmap that she maintains.

She will typically open a sprint planning meeting by pointing to the value expected at the end of the upcoming sprint, and will agree on a sprint goal with the development team before the meeting has finished.

Setting goals is one thing; however, convincing everyone that they are the right goals is something else. This is where the team building aspect kicks in. It requires intensive collaboration with customers and stakeholders on the one hand, and with the development team on the other.

An engaged product owner is, in my experience, a minimum requirement for building successful teams.

Scrum Masters can help product owners by coaching them on the responsibility not only for “what” work needs to be done (by ordering the product backlog appropriately), but also “why” it needs to be done.

The more a development team understands the rationale behind an objective, whether at product, release or sprint level, the more likely they are to deliver on the “how.”

Let’s take the sprint goal, for example. It is often narrowly translated as the set of product backlog items (PBIs) or stories that the team pulls into the sprint. Great product owners will try to get the team excited about the sprint’s value proposition before asking it to forecast how much work it can take on, and choose the PBIs from the top of the backlog that best fulfil that idea.

The sprint goal can be agreed as a name for the state the product is expected to be in at the end of the iteration, or, alternatively, as a short statement describing the value it brings to the customer. These are not the only two forms the sprint goal can take, but they are probably the most common.

I personally like to post the sprint goal somewhere on the Scrum task board as a permanent reminder to the development team of what they are currently trying to achieve.

There are a couple of advantages to separating the sprint goal out from the team’s forecast in terms of PBIs. First, it allows the product owner to report the target to customers and stakeholders in business terms they can understand.

Secondly, and more importantly, it gives the development team a reference point – a compass if you will – which can frame any decisions they might have to make in adjusting the sprint backlog in the face of unforeseen circumstances.

It also gives them some wriggle room; if they find they have overcommitted in terms of their forecast, they can discuss how to de-scope by choosing the PBI with the least impact on the overall sprint goal. Teams build their cohesion and their agile fluency through making such decisions.

Soccer coaches often talk about their role ending once the team has crossed the white line onto the pitch. Once the game starts, the team has to make collective decisions about how to defend and how to attack as the opposition’s game plan unfolds.

A development team’s ability to act autonomously also rests on the preparation that takes place before the sprint starts. The product owner’s role in that preparation should not be underestimated.