Overview
There are no “standard” agile metrics, because what’s easy to measure tends to distract us, and what’s important to measure is hard to quantify. The most important thing about Agile metrics is that we need a clear objective for using them. Normally we start with a hypothesis: say, if defect count drops, then lead time will drop too. Many of the metrics below should be used for a short period of time and then dropped once the objective is reached. That being said, I’ve seen clients dovetail push-and-pull metrics, like lead time and defect count, so that they can be used for a longer period: if someone starts gaming one number, they get penalized on the other.
The following may also be useful:
· Lead Time
· Defect count (at various phases; what’s a bug?)
· Work in Progress
· Code coverage
· Unplanned Changes
· Velocity (story points or story count per sprint)
· Return on investment
· Innovations per sprint
· Artifacts generated
· Slack time
· Failure Load (firefighting time)
· Iteration Burn-Down
· Unfinished Stories
· Customer Satisfaction
· LOC (lines of code)
· Un-deployed Stories
· # Blocks
· Budget/Schedule Compliance
· Flow Efficiency (touch time / lead time)
· Release Burn-Up
Definitions
Definitions and cross-references for all these metrics follow.
Lead Time
Defined: Time from “concept to cash”: the total time it takes to develop an idea and sell it to a paying customer.
Caution: It may be difficult to measure actual lead time, and many teams approximate it by capturing the time a request enters the development process and the time it reaches the definition of done. This approximation may be a reasonable place to start measuring, but it may cause micro-optimization (changes that actually detract from corporate goals) or reduce customer discovery (learning what the customer would pay more for).
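As a minimal sketch of that approximation (the item data here is invented), lead time is just the interval between the timestamp a request entered the process and the timestamp it reached the definition of done:

    from datetime import datetime

    def lead_time_days(entered: str, done: str) -> int:
        # Approximate lead time: entry into development -> definition of done.
        fmt = "%Y-%m-%d"
        return (datetime.strptime(done, fmt) - datetime.strptime(entered, fmt)).days

    # Hypothetical work items: (id, entered process, reached definition of done)
    items = [("A-1", "2024-03-04", "2024-03-18"), ("A-2", "2024-03-06", "2024-03-29")]
    for item_id, entered, done in items:
        print(item_id, lead_time_days(entered, done), "days")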
Side Effects: If we blindly push this metric to a minimum, we may see:
· increased defect count
· reduced code coverage
· increased failure load
Benefits: customer satisfaction, flow efficiency, un-deployed stories, work in progress
Defect Count
Defined: Total count of surprises, unexpected behavior, flaws, and shortcomings of the product identified during or after an iteration demo.
Caution: Aggressive definitions of “defect” help everyone focus on customer satisfaction; anything short of the definition above will let real quality problems go uncounted.
Side Effects: If we blindly push this metric to a minimum, we may see:
· reduced velocity
· reduced innovation
· more unfinished stories
· more blocks
Benefits: code coverage, unplanned changes, customer satisfaction, lines of code
Work In Progress (WIP)
Defined: Number of items we are actively working on. The higher the WIP, the more multi-tasking hurts our efficiency.
Caution: While a WIP limit of 1 per person may seem ideal, research suggests the right limit is closer to 2 per person, in case a block prevents us from working on the highest-priority item.
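To make that guideline concrete (team size and board contents are hypothetical), a simple sketch that flags a board exceeding roughly two in-progress items per person:

    def wip_limit(team_size: int, items_per_person: int = 2) -> int:
        # Guideline from above: roughly 2 in-progress items per person.
        return team_size * items_per_person

    # Hypothetical board state for a 5-person team (limit = 10)
    in_progress = ["story-101", "story-102", "bug-77", "story-108", "story-110",
                   "story-112", "bug-81", "story-115", "story-116", "story-120",
                   "spike-3"]
    if len(in_progress) > wip_limit(team_size=5):
        print(f"Over WIP limit: {len(in_progress)} items in progress")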
Side effects: If we blindly decrease this metric, we may see:
· excessive slack time
Benefits: lead time, defect count, velocity, ROI, unfinished stories, un-deployed stories, blocks, budget/schedule compliance, flow efficiency
Code Coverage
Defined: Percentage of production code tested by the automated regression suite.
Caution: Static and dynamic code-coverage tools cannot tell us whether the code merely happened to be executed or whether its behavior was actually verified. The only strategy for full coverage of behavior is Test-Driven Development (TDD) or Behavior-Driven Development (BDD). Short of automation, we cannot find regressions fast enough to keep up with development.
Side effects: If we blindly increase this metric, we may see:
· increased failure load
· decreased velocity
· reduced innovation
Benefits: lead time, defect count
Unplanned Changes
Defined: Number of unanticipated change requests we were able to include in this product increment. Since Agile is all about being more responsive, this metric shows how adaptive we’ve become.
Caution: Tracking this metric could be burdensome: what counts as a change request? A font-style change in the UI? An increase in scope? Pick a granularity to track and stick with it.
Side effects: If we blindly increase this metric, we may see:
· decreased velocity (churn)
· excessive innovation (lack of focus)
Benefits: customer satisfaction, return on investment, lead time
Velocity
Defined: Abstract quantity of work that can be completed in a given iteration. Velocity automatically accounts for regular meeting overhead and business-as-usual activities. Velocity is often reported in units of Story Points, Ideal Days, Ideal Hours, or Story Count. Story Points tend to encompass effort, doubt, and complexity, so they’re packed with more information than a simple estimate.
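As a sketch of how velocity supports forward planning (all figures invented; note the Caution below), average the last few sprints and divide the remaining backlog by that average:

    def sprints_remaining(backlog_points: int, recent_velocities: list) -> float:
        # Forward-looking forecast only -- never a performance evaluation.
        average = sum(recent_velocities) / len(recent_velocities)
        return backlog_points / average

    # Hypothetical: 120 points left; last four sprints completed 21, 18, 24, 19
    print(round(sprints_remaining(120, [21, 18, 24, 19]), 1))  # 5.9 sprints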
Caution: For large organizations, it helps to normalize Story Points to approximately 1 Ideal Day to simplify strategic and roadmap-level planning. Story Points should not be used to evaluate past performance; they’re only intended for forward planning.
Side effects: If we blindly increase this metric, we may see:
· reduced customer satisfaction
· increased failure load
· reduced artifacts generated
· reduced innovation
· reduced slack time
Benefits: lead time, budget/schedule compliance, flow efficiency, release burn-up
Return On Investment
Defined: Percent earnings based on revenue, capital investment, and operational cost.
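One common formulation (an assumption on my part, since the post doesn’t fix a formula) treats ROI as net earnings over total cost:

    def roi_percent(revenue: float, capital: float, operations: float) -> float:
        # Net earnings divided by total investment, as a percentage.
        cost = capital + operations
        return 100.0 * (revenue - cost) / cost

    # Hypothetical figures for one product increment
    print(round(roi_percent(revenue=500_000, capital=200_000, operations=150_000), 1))  # 42.9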
Caution: Many teams don’t have access to this data, or don’t track it long enough to see the impact of their work on ROI. Yet it’s key to justifying investment in software.
Side effects: If we blindly increase this metric, we may see:
· reduced innovation
· reduced customer satisfaction
· increased failure load
Benefits: lead time, budget/schedule compliance
Innovations per Sprint
Defined: As an agile team becomes more cross-functional, the whole team gains a greater appreciation for what the customer finds valuable. When this results in feature ideas that the Product Owner selects for the backlog, we consider this a success of the whole team.
Caution: Innovation must be customer-centric: in Kano’s terms, either a linear feature or an exciter/delighter.
Side effects: If we blindly increase this metric, we may see:
· reduced release burn-up
· excessive unplanned changes
· increased lead time
Benefits: customer satisfaction, return on investment
Artifacts Generated
Defined: Any document or non-source-code electronic file generated as a result of the software development process is an artifact. We may want to track help files generated to get a sense of whether our development is sustainable.
Caution: Some artifacts were historically created for visibility into a long development cycle. If you can rely on automated customer tests instead, this type of “executable specification” will be demonstrably current.
Side effects: If we blindly increase this metric, we may see:
· increased lead time
· increased work in progress
· reduced budget/schedule compliance
Benefits: n/a
Slack Time
Defined: Buffer, maintenance, or creative work that is tangentially related to prioritized product backlog items. Just as a highway sees serious congestion at 80% utilization, software teams loaded above 80% hit serious performance bottlenecks.
Caution: Slack time is not vacation or goofing off. It is one of the only steps in an agile SDLC that consistently reduces technical debt.
Side effects: If we blindly increase this metric, we may see:
· reduced velocity
· more unfinished stories
· more un-deployed stories
Benefits: lead time, failure load, innovations per sprint, customer satisfaction
Failure Load
Defined: Percent of time spent fixing defects. Failure load is waste; it forces our customers to pay for features twice. We want to avoid failure load whenever practical. You can’t go fast without high quality!
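A minimal sketch with an invented sprint time log: failure load is simply defect-fixing time as a share of all time worked:

    def failure_load_percent(defect_hours: float, total_hours: float) -> float:
        # Share of working time spent fixing defects rather than adding value.
        return 100.0 * defect_hours / total_hours

    # Hypothetical sprint: 30 of 200 team-hours went to firefighting
    print(failure_load_percent(defect_hours=30, total_hours=200))  # 15.0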
Caution: n/a
Side effects: If we blindly decrease this metric, we may see:
· reduced velocity
· reduced innovation
· more unfinished stories
· more blocks
Benefits: code coverage, unplanned changes, customer satisfaction, lines of code
Iteration Burn-Down
Defined: Bar chart showing hours or story points remaining per day of the iteration. The trajectory of the bars shows whether we’re on schedule or not.
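As an illustration (the daily figures are invented), we can compare the remaining work each day against a straight-line ideal to see whether the iteration is on schedule:

    def ideal_remaining(total: float, day: int, length: int) -> float:
        # Straight-line burn-down from `total` points to zero over `length` days.
        return total * (1 - day / length)

    # Hypothetical 10-day iteration starting with 40 points
    remaining_by_day = [40, 38, 37, 35, 35, 30, 28, 25, 20, 12, 0]
    for day, actual in enumerate(remaining_by_day):
        status = "behind" if actual > ideal_remaining(40, day, 10) else "on track"
        print(f"day {day}: {actual} points remaining ({status})")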
Caution: Without small enough stories, teams will see a “clumping” effect where most of the work tends to get finished at the end of the iteration. This is not desirable; find ways to get to done earlier so there is time to make unforeseen adjustments.
Side effects: If we blindly improve this metric, we may see:
· increased blocks
Benefits: unfinished stories, work in progress, return on investment, customer satisfaction, budget/schedule compliance
Unfinished Stories
Defined: Any story that did not reach the “definition of done” in the same iteration in which it was begun is an unfinished story. A product owner may cancel, re-schedule, split, re-scope, or defer such a story.
Caution: Unfinished stories come from a lack of discipline. There’s always a way to negotiate a good story so that it can be split or completed this iteration.
Side effects: n/a
Benefits: customer satisfaction, lead time, release burn-up
Customer Satisfaction
Defined: Increased customer retention or increased revenue.
Caution: Learning about customer retention is slow, and we need safe sandboxes in which to experiment and learn more quickly (e.g., pilot markets or beta tests).
Side effects: If we blindly increase this metric, we may see:
· reduced innovation
· reduced slack time
Benefits: return on investment
Lines of Code (LOC)
Defined: One source-code line; from an agile perspective, every line of code increases the risk of system failure and the cost of maintenance. We seek elegance and clean code, and we avoid duplication in the code base.
Caution: Mature software shouldn’t always grow. At some point, refactoring will keep the LOC count stable while we continue to add features. At the same time, if we make code difficult to read or understand, we’ll introduce additional risk for system maintainers.
Side effects: If we blindly decrease this metric, we may see:
· increased defect count
Benefits: lead time, failure load
Un-deployed Stories
Defined: Stories that have reached a team’s definition of done but are not yet actually earning money or being used by a customer.
Caution: Until a paying customer uses our product increment, there is risk that delivery teams will need to get involved in supporting it.
Side effects: If we blindly decrease this metric, we may see:
· decreased customer satisfaction (a product that changes too often?)
Benefits: lead time, defect count, work in progress, unplanned changes, innovations
# Blocks
Defined: The number of impediments that development teams have asked for help on.
Caution: A large number of blocks may mean teams aren’t being as proactive as they could be, or they don’t have an adequate “definition of ready” before accepting work.
Side effects: If we blindly decrease this metric, we may see:
· unfinished stories
· un-deployed stories
Benefits: lead time
Budget/Schedule Compliance
Defined: Compare the estimate of a strategic or roadmap-level portfolio item with the team-level estimates (for completed stories only).
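A sketch of that comparison (numbers invented): sum the team-level estimates for the completed stories and compare against the roadmap-level estimate for the portfolio item:

    def budget_compliance(roadmap_estimate: float, story_estimates: list) -> float:
        # Team-level estimates (completed stories only) as a fraction of the
        # roadmap-level estimate; values above 1.0 signal an overrun.
        return sum(story_estimates) / roadmap_estimate

    # Hypothetical portfolio item estimated at 100 points at the roadmap level
    print(budget_compliance(100, [30, 25, 20, 35]))  # 1.1 -> 10% over estimate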
Caution: Until a product increment is considered deployable (a minimally marketable feature), we cannot make any assessment of its cost.
Side effects: If we blindly optimize this metric, we may see:
· reduced innovation
· fewer unplanned changes
· reduced customer satisfaction
Benefits: increased return on investment, reduced lead time, reduced work in progress
Flow Efficiency
Defined: flow efficiency = touch time / lead time; that is, the amount of time someone is actively working on an item divided by the total time it takes to go through the whole system.
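A worked example (numbers invented) to make the ratio concrete:

    def flow_efficiency(touch_days: float, lead_days: float) -> float:
        # Active work time as a share of total lead time (1.0 = no waiting).
        return touch_days / lead_days

    # Hypothetical story: 21 calendar days end to end, 3 days of active work
    print(f"{flow_efficiency(touch_days=3, lead_days=21):.0%}")  # 14%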
Caution: Flow efficiency highlights wait time in the existing process, though we really need to focus on value-added time. Use it to identify red flags, but only as a secondary method to value-added optimization.
Side effects: If we blindly increase this metric, we may see:
· excessive slack time
· excessively limited WIP
Benefits: lead time, return on investment
Feedback
What's missing? What would you change?
5 comments:
Great post! Every metric should come with a warning label, just like medicine you buy. I will be archiving this post for future use! Thank you for sharing this.
-John
Andre - thank you, this post is beautiful. I just shared it with a CSM class. Their reaction: nearly every measure can be gamed, so perhaps we should avoid metrics.
Very nice post; you have a great collection here. I think we should avoid the metrics, and that this is the message you want to convey. Keep it up.
Great post! I like the framework for the metrics definitions.
The Side effects sections contain indicators for when a metric might be "stopped". It might be useful to reword the first line as follows: "If any of the following begin to be noticed, consider removing this metric for a time."
I am not sure why you do not consider source code an artifact. To me, it is one of the primary artifacts.
Under "Defect Count" the "Caution" section appears truncated.
Your article was one of the inspirations to describe my real-life examples of metrics enriched by decisions based on them. https://www.linkedin.com/pulse/real-life-agility-metrics-visualizations-lead-you-piotr-maksimczyk