Flow Metrics For Scrum Teams
ISBN 979-8-9867724-2-4
The Triangle
Clusters of Dots
Gaps
Internal and External Variability
Conclusion
Endnotes
Chapter 10 - Tooling
What To Look For In A Tool
Red Flags
Conclusion
Endnotes
Bibliography
Why This Book
This book came about from Slack discussions between Dan & Will
about many of the things Dan disliked about Scrum. He still isn’t
the greatest fan, but he’s improving. We sought to create a useful
guide to more data-driven ways of managing work, and hopefully
help some teams move away from Story Point theater. Mostly, we
hope these tools will help your team spend less time planning and
discussing plans, and more time building great things.
Thanks
The authors would like to thank James Scrimshire for the cover art.
Also, thanks to our proofreaders Colleen Johnson, Stephan Vlieland,
Stas Pavlov, Prateek Singh, and Frank Steeneveld for their feedback.
We’ve tried to remove all the jokes about Jira. We still don’t like it.
Chapter 1 - Let’s Begin
Before you can even think about applying Flow Metrics to your
Scrum implementation, there are a few things that you need to have
in place. Unfortunately, the Scrum Guide is largely silent on these
crucial pieces, so we’ll have to spend a bit of time explaining them
in detail here. To make things a little more interesting, let’s turn the
rest of this chapter into a drinking game. Every time we say “flow”
or “workflow” from now on, take a drink. So go get your favourite
tipple ready and let’s proceed.
What Is Flow?
The whole reason for the existence of your Scrum Team is to
deliver value for your customers/stakeholders. Value, however,
doesn’t just magically appear. Constant work must be done to turn
potential product improvements into tangible customer value. The
sequence of steps needed to turn a product improvement idea into
something concrete that our customers find valuable is called a process.
You have chosen Scrum as the framework upon which to build your
process for value delivery. It’s a common misunderstanding that
Scrum itself is a process. It is not. It is a framework within which
you create and continuously improve a value delivery process.
Whether you know it or not, you and your team have built a value
delivery process that goes way beyond Scrum. That process may be
explicit or implicit, but it exists.
We cannot overstate the importance of this concept, because having
an understanding of process is fundamental to the understanding of
Flow. Once your process is established, Flow can then be observed
as the movement of potential value through that process.
Conclusion
Your job as a Scrum team is to deliver value to your customer(s).
By choosing Scrum, you have selected a very specific framework
upon which to implement a process for value delivery.
If you haven’t already, you need to sit down with your team and
decide–for your process–what it means for a PBI to have
started and what it means for a PBI to have finished. All other
flow conversations will be dependent on your boundary decision.
Once defined, the movement of potential value between your defined
started and finished points is what is called Flow. The concept of
movement is of crucial importance because the last thing we want
as a team is to start a whole bunch of PBIs but never finish them.
That is the antithesis of value delivery. What’s more, as we do our work
our customers are constantly going to be asking (whether we like it
or not) questions like “how long” or “how many”–questions which
require our understanding of movement to answer.
That’s where Flow Metrics come in.
Chapter 2 - The Basic Metrics of Flow
The four measures to track for Flow are Work In Progress (WIP),
Cycle Time, Work Item Age, and Throughput.
Work In Progress
The generic definition for WIP in a given flow context is: All
discrete units of potential customer value that have entered a given
process but have not exited. In a Scrum context, the discrete units
of work are called Product Backlog Items (PBIs); so, to calculate
WIP you simply count the PBIs within your process boundaries
as defined above. That’s it: just count PBIs and you will have
calculated WIP.
Your first question might be, “how does complexity fit into the
WIP calculation?” The short answer is it doesn’t. This is probably
the hardest concept to grasp for people who have been taught that
capacity is a function of PBI complexity. It isn’t. There is nothing
in the principles of flow that requires you to understand the relative
complexity of items that are moving through your process. We will
explain why this is in the next chapter, but for now we’re going
to ask you to suspend disbelief and just accept that as true. The
very good news is that if you hate estimating in story points, then
you can drop that practice immediately upon the adoption of Flow
Metrics. But more on that a little later.
Your next objection might be, “well, if complexity doesn’t matter,
then certainly the size of the PBIs does.” After all, the PBIs that come
through your process will be of a wide variety of sizes. How can
you possibly account for all of that variability and come up with a
predictable system by just counting PBIs? While that is a reasonable
question, it is not something to get hung up on. As with complexity,
there is no requirement to do any kind of upfront estimation of
size when practicing flow, beyond having a very short conversation
about the Service Level Expectation (explained in Chapter 6) when
you pull an item. But more on this later, when we talk about Sprint
Planning.
If you happen to already be using Kanban in a Scrum context, then
it should also be noted that there is a difference between WIP and
WIP limits. You cannot calculate WIP simply by adding up all the
WIP limits on your board. You might think it should work that way,
but in reality it does not. This should be obvious, as most Kanban
boards do not always have every column–or the board as a whole–at
its full WIP limit.
A more common situation is to have a Kanban board with WIP limit
violations in multiple columns–or across the whole board. In either
of those cases simply adding up WIP limits will not give you an
accurate WIP calculation. The sad truth is there is no getting around
actually counting up the physical number of items in progress to
come up with your total WIP.
Bottom line, if you want to optimize Flow but are not currently
tracking WIP, then you are going to want to start. Sooner is better
than later.
Cycle Time
In the previous section we stated that a process has specific arrival
and departure boundaries and that any item of customer value
between those two boundaries can reasonably be counted as WIP.
Once your team determines the points of delineation that define
Work In Progress, the definition of Cycle Time becomes very easy:
Cycle Time: The amount of elapsed time that a work item spends
as Work In Progress.
This definition is based on one offered by Hopp and Spearman in
their Factory Physics book² and, you will note, agrees exactly with
the definition given at the beginning of this chapter. Defining Cycle
Time in terms of WIP removes much–if not all–of the arbitrariness
of some of the other explanations of Cycle Time that you may have
seen (and been confused by) and gives us a tighter definition to start
measuring this metric. The moral of this story is: you essentially
have control over when something is counted as Work In Progress
in your process. Take some time to define those policies around
your started and finished points.
Throughput
We’ve saved the easiest metric to define for last. Simply put,
Throughput is defined as:
Throughput: the amount of WIP (number of PBIs) completed per
unit of time.
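To make the counting concrete, here is a minimal sketch (in Python, with made-up dates and our own variable names, not any particular tool's export) of turning a list of finished dates into a daily Throughput count:

```python
from collections import Counter
from datetime import date

# Hypothetical finished dates, one entry per completed PBI.
finished_dates = [
    date(2022, 3, 1), date(2022, 3, 1), date(2022, 3, 2),
    date(2022, 3, 4), date(2022, 3, 4), date(2022, 3, 4),
]

# Throughput per day = how many PBIs finished on each date.
# Days with no completions have a Throughput of zero; remember to
# include them when you build a history for forecasting.
throughput_per_day = Counter(finished_dates)
for day, count in sorted(throughput_per_day.items()):
    print(day, count)   # e.g. 2022-03-01 2
```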
While WIP is an instantaneous metric–that is, at any time you
could count all of the
PBIs in your process to calculate WIP–it is usually more helpful
to talk about WIP over some time unit: days, weeks, Sprints, etc.
Our strong recommendation–and this is going to be our strong
recommendation for all of these metrics–is that you track WIP
per day. Thus, if we wanted to know what our WIP was for
a given day, we would just count all the PBIs that had started but
not finished by that date. For Figure 2.1, our WIP on January 5th is
3 (PBIs 3, 4, and 5 have all started before January 5th but have not
finished by that day).
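Here is a similarly minimal sketch of that daily WIP count; the PBI names and dates are made up to mirror the Figure 2.1 example, and the "not finished by that day" boundary is one reasonable convention your team may define differently:

```python
from datetime import date

# Hypothetical (started, finished) dates per PBI; finished is None if still in progress.
pbis = {
    "PBI-3": (date(2022, 1, 2), date(2022, 1, 7)),
    "PBI-4": (date(2022, 1, 3), None),
    "PBI-5": (date(2022, 1, 4), date(2022, 1, 6)),
}

def wip_on(day):
    """Count PBIs that had started but not finished by the given day."""
    return sum(
        1
        for started, finished in pbis.values()
        if started <= day and (finished is None or finished > day)
    )

print(wip_on(date(2022, 1, 5)))  # -> 3
```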
Cycle Time
Cycle Time equals the finished date minus the started date plus one
(CT = FD - SD + 1).
If you are wondering where the “+ 1” comes from in the calculation,
it is because we count every day in which the item is worked as
part of the total. For example, when a PBI starts and finishes on the
same day, we would never say that it took zero time to complete.
So we add one, effectively rounding the partial day up to a full
day. What about items that don’t start and finish on the same day?
For example, let’s say an item starts on January 1st and finishes
on January 2nd. The above Cycle Time definition would give an
answer of two days (2 – 1 + 1 = 2). We think this is a reasonable,
realistic outcome. Again, from the customers’ perspective, if we
communicate a Cycle Time of one day, then they could have a
realistic expectation that they will receive their item on the same
day. If we tell them two days, they have a realistic expectation that
they will receive their item on the next day, etc.
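As a tiny sketch of that calculation (the helper name is ours; the dates are for illustration only):

```python
from datetime import date

def cycle_time_days(started, finished):
    """CT = finished date - started date + 1, so a same-day item counts as one day."""
    return (finished - started).days + 1

print(cycle_time_days(date(2022, 1, 1), date(2022, 1, 1)))  # -> 1
print(cycle_time_days(date(2022, 1, 1), date(2022, 1, 2)))  # -> 2
```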
You might be concerned that the above Cycle Time calculation is
biased toward measuring Cycle Time in terms of days. In reality,
you can substitute whatever notion of “time” that is relevant for
your context (that is why up until now we have kept saying track
a “timestamp” and not a “date”). Maybe weeks is more relevant for
your specific situation. Or hours. Or even Sprints. For your Scrum
team, if you wanted to measure Cycle Time in terms of Sprints, the
same calculation applies using Sprint numbers instead of dates.
Randomness
We’ve saved the most difficult part for last. You now know how
to calculate the four basic metrics of flow at the individual PBI
level. Further, we now know that all of these calculations are
deterministic. That is, if we start a PBI on Monday and finish it
a few days later on Thursday, then we know that the PBI had a
Cycle Time of exactly four days.
But what if someone asks us what our overall process Cycle Time
is? What if someone asks us what our Scrum Team’s Throughput
is? How do we answer those questions?
Our guess is you immediately see the problem here. If, say, we look
at our team’s Cycle Time for the past six Sprints, we will see that
we had PBIs finish in a wide range of times. Some in one day, some
in five days, some in more than 14 days, etc. In short, there is no
single deterministic answer to the question “what is our process
Cycle Time?”. Stated slightly differently, your process Cycle Time
is not a unique number, rather it is a distribution of possible values.
That’s because your process Cycle Time is really what’s known as a
random variable. [By the way, we’ve only been talking about Cycle
Time in this section for illustrative purposes, but all of the basic
metrics of flow (WIP, Cycle Time, Age, Throughput) are random
variables.]
What random variables are and why you should care is one of those
topics that is way beyond the scope of this book. But what you do
need to know is that your process is dominated by uncertainty and
risk, which means all Flow Metrics that you track will reflect that
uncertainty and risk, and further, that uncertainty and risk will
show up as randomness in all of your Flow Metric calculations.
The broader implication is that once randomness shows up, you
can throw determinism out the window. Once you know you
are dealing with a random process, you are required to take a
probabilistic approach. Thankfully for us, probabilistic thinking is
Conclusion
What we have shown here are just the basic metrics of flow to get
you started: WIP, Cycle Time, Work Item Age, and Throughput.
There are most certainly other metrics that you will want to track
in your own environment, but these represent the metrics common
to all flow implementations. If your goals are improvement and
predictability, then these are the metrics that you will want to track.
Endnotes
Conclusion
As we will see in the coming chapters, the true power of Flow
Metrics for Scrum Teams comes in acknowledging their probabilistic
nature. Not every PBI that flows through your process will take
exactly the same amount of time, nor will every Sprint deliver
exactly the same number of PBIs to done.
Embracing probabilistic thinking will allow you to make more accu-
rate plans and will ultimately allow you to be more predictable for
customers/stakeholders. But what are we thinking probabilistically
about and what does probabilistic thinking have to do with Flow
Metrics? Funny you should ask…
Endnotes
As you can see from Figure 4.1, across the bottom (the X-axis)
is some representation of the progression of time. The X-axis
represents a timeline for your process. You will notice that Figure
4.1 shows the timeline progression from left to right. This is not
a requirement, it is only a preference. However, all Cycle Time
Scatterplot examples in this book will show a time progression from
left to right.
Up the side (the Y-axis) of your chart is going to be some representa-
tion of Cycle Time. Again, you can choose whatever units of Cycle
Time that you want for this axis: days, weeks, months, etc.
To generate a Scatterplot, any time a PBI completes, you find the
date that it completed across the bottom and plot a dot on the chart
at a height according to its Cycle Time. For example, let’s say a
work item took seven days to complete and it finished on January 1.
You would find that date on the X-axis and plot a dot at a height of
seven days on the Y-axis.
In Figure 4.2, the 50th percentile line occurs at eight days. That
means that 50% of the PBIs that have flowed through our process
took eight days or less to complete. So we can say that when a PBI
enters our process it has a 50% chance of finishing in eight days or
less. That is without doing any estimation! (More on this concept
in a later chapter).
Using this same approach, we can calculate any percentile. A
commonly used percentile is the 85th. Again, this line represents
the amount of time it took for 85% of our work items to finish. In
Figure 4.3 below you can see that the 85th percentile line occurs at 15
days. That means that 85% of the dots on our chart are on or below
that line, and 15% of the dots on our chart are above that line. This
percentile line tells us that when a work item enters our process
it has an 85% chance of finishing in 15 days or less, again, with no
estimation.
The 50th, 85th, and 95th percentiles are probably the most popular
“standard” percentiles to draw. You may see other percentiles,
though, so we have included Figure 4.4 with a few more.
Figure 4.4 - 30th, 50th, 70th, 85th, 95th, and Mean Percentile Lines
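As a rough illustration of how these percentile lines can be computed from raw Cycle Times, here is a minimal sketch; the numbers are made up, chosen only to echo the 8-day and 15-day lines discussed above, and the helper is ours rather than any particular tool's:

```python
import math

# Illustrative Cycle Times (in days) of 20 recently finished PBIs.
cycle_times = [1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 8, 9, 11, 12, 14, 15, 15, 17, 21, 30]

def percentile_line(data, pct):
    """Smallest Cycle Time such that at least pct% of finished items took that long or less."""
    ordered = sorted(data)
    index = math.ceil(pct / 100 * len(ordered)) - 1
    return ordered[index]

print(percentile_line(cycle_times, 50))  # -> 8  (50% finished in 8 days or less)
print(percentile_line(cycle_times, 85))  # -> 15 (85% finished in 15 days or less)
```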
Before we get into how this chart should be used, let’s quickly go
over the anatomy of the chart so you know what you are looking
at. Unlike the Scatterplot, you can see the whole chart has been
Conclusion
Tracking metrics on their own is of little use unless we have an
effective way to display that data. The Cycle Time Scatterplot and
WIP Aging Chart are two of a myriad of analytical graphs that
we can employ for data-informed decisions. It’s just that these two
will be most helpful as we talk about introducing Flow Metrics
into the Scrum events. For a more detailed discussion of these two
charts, see the endnotes.
Endnotes
Flow Metrics can help with answering one of these questions, but
we’re not going to tell you which one.
By the way, this is a doozy of a chapter, so you’ll maybe want to go
make yourself a cuppa before you settle in.
1. More recent data is usually better than less recent (older) data.
It’s possible that you’ve heard that “more data is better than
less data”. That statement is not necessarily true because of
this recency principle. Let’s say you had Throughput data on
your Scrum Team going back 10 years. If more data is better
than less data does that mean you should use all 10 years’
worth of Throughput for your simulation? Obviously not.
But how much recent data do I need to perform a successful
simulation? There is no exact answer to that question, but
the general rule of thumb is a minimum of 10-12 data points,
with around 20 being a more comfortable amount.
Before you get too scared, that doesn’t mean you need to run
20 Sprints before you have enough data to perform an MCS.
Remember from Chapter 2 that we recommend using days as
your Throughput unit of time. That means if you are doing
two week Sprints then in two Sprints you should have more
than enough data to get started. (Two Sprints of two weeks each
gives you at least 20 daily Throughput data points.)
simulation. There are many others. For more information, see Dan’s
book “When Will It Be Done?”³]
Step 3: Aggregate the results. For each “run” of the simulation, we
are going to track the results in what we are going to call a Results
Histogram. What the Results Histogram is and how to interpret it
will be discussed shortly.
Step 4: Repeat steps 1-3 an arbitrary number of times until you
have a clear picture of what the result set looks like. With each
run of the simulation, the probability distribution as described by
the Results Histogram will become clearer and clearer. For example,
think of the coin flipping example. Do you think you would have a
better idea of probability of getting tails if you flipped two times or
two thousand times? Even so, MCS will converge on a result fairly
quickly, so all that more runs will do is get you a better-looking Results
Histogram. For most Sprint Planning, we’ve found 1,000-2,000 runs
gives pretty good results. Anything much more than 10,000 doesn’t
usually provide much better quality. Try this for yourself to see it
in action.
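For the curious, here is what those steps might look like in code. This is a minimal sketch, not the authors' tooling: the Throughput history, Sprint length, and helper names are all illustrative, and the last function implements the "add up bar heights from the right of the histogram" idea explained in the next section:

```python
import random
from collections import Counter

# Historical daily Throughput (PBIs finished per day); illustrative numbers only.
daily_throughput_history = [0, 2, 1, 3, 0, 1, 2, 2, 0, 4, 1, 2, 3, 0, 1, 2, 1, 0, 2, 3]

SPRINT_LENGTH_DAYS = 10   # working days in the Sprint
RUNS = 10_000             # number of simulated Sprints

results = Counter()
for _ in range(RUNS):
    # Steps 1-2: for each day of the Sprint, randomly pick a Throughput from history.
    total = sum(random.choice(daily_throughput_history) for _ in range(SPRINT_LENGTH_DAYS))
    # Step 3: aggregate into the Results Histogram.
    results[total] += 1

# Step 4 (interpretation): walk the histogram from the right, adding up bar
# heights until the desired share of runs (e.g. 85%) has been covered.
def confident_forecast(histogram, runs, confidence=0.85):
    """Largest PBI count that was reached or exceeded in at least `confidence` of runs."""
    cumulative = 0
    for outcome in sorted(histogram, reverse=True):
        cumulative += histogram[outcome]
        if cumulative >= confidence * runs:
            return outcome

print(confident_forecast(results, RUNS))  # "we can plan on this many PBIs or more, 85% of the time"
```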
Interpreting the Results Histogram
As was just stated, the outcome of your Sprint Planning MCS will
be a Results Histogram. This histogram represents the shape of
your risk as it relates to the multiple possible future outcomes;
i.e., the different chances of you getting different numbers of PBIs
to Done. [An interesting aside is that the Results Histogram of a
MCS will usually approximate a Normal Distribution. This is due to
something in probability called the Central Limit Theorem (CLT).
You need not worry yourself with the specifics of the CLT, only
know that in general the Results Histogram will usually resemble
a bell curve.]
So how do we turn our Results Histogram into an answer to “What
can be done this Sprint?”
Since the Results Histogram represents the shape of your risk
associated with possible outcomes, the first thing we need to do
is segment that risk into what are acceptable outcomes vs. what
are not acceptable outcomes. We can do this by drawing what are
called percentile lines on the histogram.
Before we begin, let us first point out an important principle of
the Histogram that we just generated. If you sum the heights of
all the bars on the chart together you will get the number of runs
that made up the simulation. For example, if we ran 10,000 runs
in our simulation, then adding up the heights of the bars in the
resultant chart would give you 10,000. The probability of any one
particular outcome, then, is the height of a given bar divided by the
total number of runs.
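For example (with made-up numbers): if the bar at 19 PBIs has a height of 1,200 in a 10,000-run simulation, then the chance of finishing exactly 19 PBIs is 1,200 / 10,000 = 12%.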
To illustrate, take a look at the Results Histogram in Figure 5.1
below.
very likely?)
But there is another way to interpret these results. For the sake of
argument, let’s say that we did plan on getting 18 items completed
in this Sprint. If we got exactly 18 items done, then that would be a
good result. But what if we got 19 items done? You see from the
histogram that there is indeed a chance that we could finish 19
items. If we told our Product Owner we could get 18 items done,
but we actually finished 19 items, how do you think they would
feel? My guess is they would consider that a good result. The same
would be true if we got 20 items done, 21, 22, and so on…In fact,
any outcome to the right of 18 (inclusive) would be considered
acceptable as illustrated in the green shaded area of Figure 5.2
below:
tolerable? Again, for the sake of argument, let’s say our team wants
to plan on being right 85% of the time. Simple. All we do is start at
the right of the chart and start adding up the heights of the bars until we get
to 8500 (8500 / 10,000 = 85%). This is shown in Figure 5.3 below:
After seeing this, your first inclination (and certainly the first
mention the other Flow Metrics) will include data for PBIs that as
you worked on them:
The only thing your historical data won’t include are things that are
impossible to plan for (like pandemics that shut the world down
for several months). But, when they occur, those events would
undermine any forecasting technique.
Your Scrum team encounters risk and uncertainty every day as
you do your work. That risk and uncertainty is captured in the
historical record in terms of your ability to get PBIs to done –
namely, your Throughput data. Choose your historical input data
right and chances are that you have covered the majority of the
factors that could affect your Sprint Plan. And on top of that, you
will still have the added flexibility of a Sprint Goal allowing for
changes in whatever you’ve pulled (to some degree).
And that’s not even the best bit (although it really is). The best bit
is that with the right tool, an MCS can be performed in minutes as
opposed to estimating complexity which could take hours.
Speaking of complexity and sizing, there is still one aspect of Sprint
Planning that we need to cover. But as this chapter has gone on long
enough, we think it’s best that we pause and regroup before we talk
about the critical subject known as Right Sizing.
Endnotes
Right Sizing
“The Scrum Team may refine these items during this process, which
increases understanding and confidence.”¹ The question from most
Scrum Teams is “how much refinement is enough refinement?”
Tooling
There are a lot of tools out there to help you use MCS for Sprint
Planning, and if you are familiar with Dan’s work, you know he
has a very strong bias to one tool in particular.
Having said that, another strong recommendation we have for you
when starting out with MCS is to do a couple of models manually
yourself. Using the four step algorithm above, it is a straightforward
exercise to build out a spreadsheet template to do random selections
and create a results histogram. We make this recommendation for
two reasons:
Conclusion
We want to know what (how many) PBIs can be done in the next
Sprint. A solution to this problem is straightforward: feed historical
Throughput data into an MCS and interpret the results to make
probabilistic forecasts about what can be done.
The fundamental assumption when using MCS for Sprint Planning
is that the future that we are trying to forecast roughly looks like
the past that we have data for. One of the big advantages of using
Scrum is that if you are following the framework professionally,
then you are generally safe (pun intended) to use past data for
future predictions.
In a very short amount of time you can get a very accurate forecast
of what can be done in the Sprint, thus allowing you to quickly focus
all of your attention on what is important–actually doing the work.
Which is the very topic we will cover next.
Endnotes
The SLE
In both the MCS Results Histogram and the Cycle Time Scatterplot
we used percentiles to segment data along lines of uncertainty. In
MCS, those lines helped us understand the risk associated with
getting multiple PBIs to done. The Cycle Time Scatterplot is a
little different in that each dot on the chart represents how long
it took for a single item to get to done. The percentile lines on
the Scatterplot then represent the risk associated with getting an
individual PBI to done.
Understanding uncertainty at the individual PBI level is so impor-
tant because it is at the individual PBI level where Flow actually
happens. As individual items move through our process, we need
to know if they are flowing the way we expect them to. Like MCS,
that expectation is based on how much uncertainty we are willing
to live with in terms of how long it should take for a single PBI to
flow from started to finished in our process.
To illustrate this point, think about it this way: In a two week Sprint,
do you think 100% of PBIs will always complete within 14 days?
We know this is impossible. So what percentage of the time are you
willing to be right? 85%? 70%? Something else? This is where the
percentiles on the Cycle Time Scatterplot come in, because those
percentiles will tell us our percentage chance of finishing a PBI within
a given range of time.
Take a look once more at the percentiles example from Chapter 4:
Figure 7.1 - 30th, 50th, 70th, 85th, 95th, and Mean Percentile Lines
Right Sizing
Read Don Reinertsen’s book “The Principles of Product Development
Flow”² and you will quickly realize that one of the biggest
detriments to flow is working on items that are too big. In flow
terms, that means controlling batch size. We saw earlier that
usually when a PBI is stuck in your process it is because it is too
big–it hasn’t been right sized.
Right sizing is the art of enabling PBIs to flow in small batches of
value. This means breaking things down into small, manageable–
but still valuable–chunks.
We talked about the practice of right sizing in the last chapter, but
as a quick review, Prateek Singh communicated this guidance to
one of his teams (he used User Stories for PBIs):
“The Cycle Time Scatterplot for our team showed that 85% of the
stories that we work on get done in 11 days or less. This is a guide for
right sizing. Whenever the team picks up the next story, they should
ask themselves the question, “is this the smallest bit of value and
can it get done in 11 days or less?” If the answer to those questions
is yes, great, no more refinement needed, start work on it. If the
answer is no, let us try to break this story down. This is the essence
of right sizing. Each team will figure out their right-size stories from
their own data.”
This is all well and good, you may wonder, but how do we go about
breaking items down once we recognize they haven’t been right
sized? We’re glad you asked.
In this case each of these ACs can be a separate PBI. They can
all independently be delivered to customers (internal or external)
for us to get feedback. Each of these ACs starts solving a customer
problem and delivers value without being held up for the others to
finish.
Many teams we’ve encountered over the years serve very big
customer bases with their products, whether these are software
products, robots, or HR policies. Regardless of the product, many
of these teams talk about “the customer” in a very generic, all
encompassing way. Sometimes even with a hint of pride: “our
customers are all the office workers worldwide”. While this sounds
great for the press and future investors, it doesn’t allow for much
focus in the teams. More importantly, it increases the size of the
work. But we can also use our knowledge of our customers to break
up work again.
Consider the following PBI:
“Enable secure login for employees.”
We might break this up in multiple ways for different customers:
The list can go on and on. By being more specific about our cus-
tomers, we can deliver things quicker to those customers, allowing
us faster feedback. We’ve found this works for internal as well as
external customers. Not all customers are equal, and we can use
that to our advantage.
Assumptions
Ok, this goes into a topic Dan is less comfortable with: Outcomes.
But as described by Jeff Gothelf and Josh Seiden in the book Lean
UX³ “Each design is a proposed business solution — a hypothesis.
Your goal is to validate the proposed solution as efficiently as
possible by using customer feedback.” In other words, as we said
before in this book, all PBIs are an assumption of value until
validated by the customer.
If we want to break up a PBI, then one thing we can ask is
“what are all the assumptions we’re making about the value of
this thing?”. And following that, we could break up the item into
its constituent assumptions and deliver those separately. Taking
the example from before, we might break up the item to research
individual assumptions such as:
We can keep going, but hopefully you get the point. All of these
PBIs can be delivered a lot faster, with faster feedback, than the
vast starting point of “Enable secure login for employees”.
The simple heuristic for Daily Scrums is–all other things being
equal–you want to focus your attention on the oldest PBIs first. To
optimize flow, we want to minimize Age which means the oldest
PBIs on your Aging Chart are good indicators that something is
wrong. The last thing you want to do is ignore old items because
all that is going to do is make those PBIs older.
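A minimal sketch of that oldest-first walk is below; the PBI names and dates are made up, and Age is computed with the same "count the start day" convention used for Cycle Time in Chapter 2:

```python
from datetime import date

# Hypothetical in-progress PBIs and the dates they started.
in_progress = {
    "PBI-12": date(2022, 3, 1),
    "PBI-15": date(2022, 3, 7),
    "PBI-18": date(2022, 3, 9),
}

def age_in_days(started, today):
    """Work Item Age: elapsed days an item has been in progress, counting the start day."""
    return (today - started).days + 1

today = date(2022, 3, 10)
# Walk the board oldest-first in the Daily Scrum.
for pbi, started in sorted(in_progress.items(), key=lambda kv: age_in_days(kv[1], today), reverse=True):
    print(pbi, age_in_days(started, today), "days old")
```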
Conclusion
The Daily Scrum is perhaps the Event most actively impacted by
using Flow Metrics. While Sprint Planning is dramatically shorter
with this tool, the nature of that Event doesn’t change all that much.
You still figure out a Sprint Goal and try to forecast what work can
be delivered to reach that goal. With the Daily Scrum however, we
see a change. The goal of the Daily Scrum remains the same: Inspect
progress towards the Sprint Goal. But with the use of Flow Metrics,
the conversation changes from “how do we work together today
and help each other reach the Sprint Goal” (person focused) to “how
old is this stuff, and how can we get it moving” (work focused).
Using Flow Metrics in your Daily Scrum actually encourages more
teamwork by focusing on the work to be done. Because when
the work moves and gets done, you get the feedback you need to
actually inspect your progress towards the Sprint Goal, not just the
stuff you think you need to deliver.
Endnotes
1. Brooks, F. (1995). Mythical Man-Month, The: Essays on Soft-
ware Engineering, Anniversary Edition (Anniversary ed.).
Addison-Wesley Professional.
2. Reinertsen, Donald G. “The Principles of Product Develop-
ment Flow”. Celeritas Publishing, 2009.
3. Gothelf, J., & Seiden, J. (2021). “Lean UX: Designing Great
Products with Agile Teams (3rd ed.)”. O’Reilly Media.
4. Constable, G. (2021, April 8). “The Truth Curve and the Build
Curve”. https://giffconstable.com.
Chapter 8 - The Sprint Review
Using Flow Metrics in a Sprint Review can easily derail the con-
versation into a scope discussion. On the other hand, using Flow
Metrics wisely can open up the conversation with your stakehold-
ers on timing, budgets and outcomes. Regardless, Flow Metrics
allow you to switch the conversation to a more future-facing one.
Many teams still struggle with running “demos”, with everyone
present (correctly) feeling like this is a waste of time. When this
isn’t due to a simple lack of knowledge on what the Event should
be about, it’s mainly due to the fact they can’t really say anything
about the future. Well, not anymore. Because now you’ve got
measures, charts, and answers to the most pressing questions of
your stakeholders.
Let’s talk about what all of this could look like.
When Is It Done?
Dan wrote a book about this. That was a pretty smart move, consid-
ering how prevalent this question still is in modern organizations.
There’s a big can of worms here that we’re not going to open with
regards to the question of whether or not “it” will ever be “done”.
If you’re into that kind of stuff, we recommend you look at other
sources. Sometimes it’s a valid question. Things like legislative
requirements or contractual agreements on functionality certainly
have clearly defined scope that needs to be met. Most of the time it
isn’t, though, as the complexity of work means you have no idea if
feature A will result in outcome A until after you’ve built it.
All that said: If you can’t answer this question, regardless of its
validity, people get very nervous. Fortunately, we can answer this
question somewhat using our trusty Monte Carlo technique. The
technique for a “Monte Carlo: When” probabilistic forecast is a little
bit different from the “Monte Carlo: How Many” technique, so we’ll
go over it quickly.
Take a number of items you want to get done. This can be an
arbitrary amount (10, 50, 100, etc) or whatever amount is in your
Product Backlog. Let’s take 10 for this example. Now look at your
historical daily Throughput numbers; we’ll use these for the forecast.
Let’s imagine we have a history from the last 6 days of delivering
2, 1, 2, 2, 0, 1 items per day. Going forward, we can use these as
possible throughputs, and see how much time it’d take, like so:
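As an illustration only, here is a minimal sketch of that simulation in code, using the six-day history above and the 10-item target; the variable names and run count are ours:

```python
import random

history = [2, 1, 2, 2, 0, 1]   # daily Throughput from the last six days
ITEMS = 10                      # items we want to get done
RUNS = 10_000

days_needed = []
for _ in range(RUNS):
    done, days = 0, 0
    while done < ITEMS:
        done += random.choice(history)  # sample a possible "tomorrow" from history
        days += 1
    days_needed.append(days)

days_needed.sort()
# The day count that 85% of the simulated futures finished within.
print(days_needed[int(0.85 * RUNS) - 1])
```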
Using this technique with your own data, you can give a probabilistic
answer to the “When Will It Be Done?” question. We can’t
guarantee your stakeholders will like the answer though. Nerves
calmed, you can then talk about whether or not all of those items
are needed, what risk level people are comfortable with, and if
there’s enough budget or time left to even get you to the projected
dates.
it could be way less than your stakeholders were hoping for, and
you’ll be in for a nice budget conversation. Or a shouting match.
Here’s an example of what a How Many projection could look
like in a Sprint Review, if you were to project out from 60 days
(because that’s when we’re out of money). Note that this uses a
more complete dataset than the earlier example.
Now What?
Of course, everything up to this point is related to output. And if
output was all that mattered this would be the end of the conver-
sation. But Scrum Teams aim to deliver value. Your Sprint Review
should be focused around using the insights from your Sprint, feed-
back on your Product, value measures and marketplace inspection
to update your plan (your Product Backlog). Flow Metrics are meant
to help you in this process, not replace it. Probabilistic forecasting
can give you insight as to how many items might get done, but your
Sprint Review is meant to figure out what those items could be.
Never forget, though, to always remind your team and stakeholders
of the percentages associated with your forecasts. There are no
guarantees in a complex environment, so your projections might
(and probably will) change over time. So make these conversations
and measures a standard part of your Sprint Reviews.
So, with Flow Metrics as your foundation, explore that value!
Conclusion
Up until now, we’ve only talked about using Flow Metrics as a
Scrum Team for controlling your workflow. This chapter explored
the use of Flow Metrics beyond your Team, and as a part of
Product Management. And we’ve hopefully convinced you of the
tremendous value they can offer in improving your relationship
with your stakeholders. From offering a data-based answer to the
“When Will It Be Done” question, to being able to forecast how
many items could fit within a certain amount of time: Your data will
allow you to elevate the conversation, skipping the output-related
questions and moving you quicker to value.
Endnotes
The Triangle
A triangle-shaped pattern as shown in Figure 9.1 will appear in any
situation where Cycle Time increases over time.
Notice how the dots in the above Scatterplot form a pattern that
looks something like a triangle. Explaining this phenomenon is go-
ing to require us to review the fundamental property of Scatterplots:
dots do not actually show up until a work item has finished. The
items that have longer Cycle Times are going to need an extended
period before they appear on the chart. That means that the longer
the Cycle Time (the dot’s Y-component) the longer the amount of
time we are going to have to wait (the dot’s X-component) to see
that data point.
The triangle pattern appears whenever you start with a process with
zero WIP. This is because it takes time to “prime the pump” and get
items to done. Obviously, in those early stages work will be pulled
in faster than it departs—even if we are limiting WIP. We are going
to need time for each workflow step to fill up to its capacity and get
a predictable flow going. This pattern is exacerbated in situations
where teams feel they have to empty the process at regular intervals
too. In this case what you will see over time is a repeated triangle
pattern, which shows an issue with flow. An example of what we’re
talking about is shown in Figure 9.2:
While the timeframes of both Figures 9.1 and 9.2 are much longer
than a Sprint, you can still see how this pattern could emerge over
time in Scrum, given what we just explained above. From a flow
perspective Figure 9.2 is terrible: the team started on everything
around the same time, showing massive WIP. From a Scrum
perspective, it just means that this pattern should be discussed
in the Sprint Retrospective. As an exercise for the reader, what
questions might you ask if you see the pattern in Figure 9.2 emerge
in every Sprint?
Clusters of Dots
The second type of pattern that might emerge is an obvious cluster-
ing of dots on your Scatterplot. Consider, for example, the following
chart in Figure 9.3:
Gaps
Gaps in the dots on your Scatterplot means that no work items
finished in that particular time interval:
In short, the lines that you see on Figure 9.4 are indicative of
batch transfer. It is not uncommon for a Scrum team to generate
a Scatterplot that looks like Figure 9.4. In this example it is quite
obvious that the stacks of dots that you see are at Sprint boundaries–
probably when there is a mad rush to complete PBIs before the end
of the Sprint. But look at how the data thins out between those
stacks. Is this a good thing or a bad thing? Is this even how Scrum
is supposed to work? Either way, what impact is this having on our
predictability? If you think it is a bad thing, what might you do to
change that?
Conclusion
Data without action is meaningless. The purpose of your Sprint
Retrospective is to improve your overall process over time. Metrics
are just one, but important, piece of that.
As you go into your next Sprint Retrospective, remember that your
policies shape your data and your data shape your policies. Your
data is telling you the story of your process. Are you listening?
Endnotes
team. Does this mean you should pick tools adhering to common
standards with APIs for accessing data? Obviously. - Will]
Data
First and foremost, your tooling must generate good data. Most
other things you can do with discipline, but if you need to manually
keep timings on items that’s going to be a good chunk of your day
gone. We’re serious here: Will once coached a team that had to
work with a ticketing system that, for some reason, only tracked the
time an item was created (not pulled), the time it was closed, and the
time since it last moved. That meant that for Cycle Time purposes someone
from the team had to manually log what was in what step of the
workflow once a day. Took about an hour. Each day. So yeah, find
a tool that keeps time accurately. So what do we mean by that?
It should at the very least:
• Log the name of the item (so you don’t have to cross-reference)
• Log the time the item spent blocked
• Log the item type as an attribute (so you can filter)
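As a rough illustration of the kind of per-item record we mean, here is a minimal sketch; the field names are ours, not any particular tool's, and they simply capture what Chapter 2 needs to calculate the basic Flow Metrics:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class WorkItemRecord:
    """The minimum we'd want a tool to export per PBI (field names are illustrative)."""
    name: str                      # so you don't have to cross-reference
    item_type: str                 # so you can filter (story, bug, ...)
    started: Optional[datetime]    # when it crossed your started point
    finished: Optional[datetime]   # when it crossed your finished point
    blocked_days: float = 0.0      # total time spent blocked

    def cycle_time_days(self) -> Optional[int]:
        if self.started is None or self.finished is None:
            return None
        return (self.finished.date() - self.started.date()).days + 1
```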
Products and their associated workflows will change over time for
all the reasons that make work complex. Sometimes anticipated,
most times due to sudden insights and influences. This means
that your tool should be flexible enough to support fast and easy
workflow changes.
WIP limit can include (but is not limited to) work items in a single
column, several grouped columns/lanes/areas, or a whole board.”.
This means your tooling should allow you to define WIP limits
in whatever way (or combinations of ways) you please. Examples
include:
• Per column
• Per item type
• Per lane
• Per person
• Per release
• In total
• (Any combination of) all of the above
On top of that, you’d want your tool to also warn you in some way
when you’re going to break them, or even prevent it. Many tools
will happily and silently allow you to keep pulling more work, with
the only feedback being a tiny “(!)23/7” on top of the column. That’s
not much of a warning, is it? If your team is serious about flow, you
might want to outright prevent limits from being broken, or require
extra effort for each item you want to pull over the limit. Perhaps
even with a warning to an internal Slack or Teams channel.
Again, this is not meant to be a substitute for team discipline. But
support from a tool can make this discipline a lot easier to maintain.
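To make the idea concrete, here is a minimal sketch of the kind of check a tool (or a small script run against its API) could perform; the column names, limits, and messages are all illustrative:

```python
# Current item counts and WIP limits per column; illustrative data only.
wip = {"To Do": 6, "In Progress": 4, "Review": 3, "Done": 12}
limits = {"In Progress": 3, "Review": 3}   # limits could also be per lane, per type, or board-wide

def check_limits(wip, limits):
    """Warn when a limit is already broken, or would be broken by pulling one more item."""
    warnings = []
    for scope, limit in limits.items():
        count = wip.get(scope, 0)
        if count > limit:
            warnings.append(f"{scope}: {count}/{limit} - WIP limit already broken")
        elif count == limit:
            warnings.append(f"{scope}: {count}/{limit} - pulling another item would break the limit")
    return warnings

for message in check_limits(wip, limits):
    print(message)   # a real tool might post this to a Slack or Teams channel instead
```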
Exit Criteria
Done that isn’t specific to any particular column. But ideally, your
Definition of Done should be spread out and visible throughout
your workflow, in an easy to absorb format. Not many tools support
this, but those that do make the lives of Developers a lot easier.
Red Flags
Having gone through the features you’re looking for, let’s now also
look at some red flags you want to be on the lookout for. These will
range from instant dismissal to “you can use the tool, just avoid
this feature”. We again realize not every Team has the luxury of
picking their own tool [grumble - Will], but even in that case this
list should provide some talking points with the powers that be.
Mandatory Estimates
Mandatory Assignments
Averaging Graphs
At this point in reading this book you’ll have seen quite a few
graphs. Graphs are nice, when they visualize the right things.
Graphs showing averages don’t do that. Understanding probabilis-
tic forecasting is hard enough for a team that works with them
daily; you don’t need the added distraction of meaningless averages.
Worse still, your stakeholders might see it and start developing the
wrong expectations. All in all, it’s better to have a tool that produces
no graphs (as you can use other tools for this, including Excel) than
a tool that shows bad graphs.
Conclusion
Tools can be a great help or a great hindrance. We hope this chapter
provided you with some good criteria to use when choosing or
evaluating a tool. A final tip we’d like to share is to start simple.
Don’t rely on a tool to compensate for a lack of knowledge or
discipline in a team. Get to a point where your team has achieved
Flow in their delivery, and find a tool that supports that Flow.
Endnotes
Starting Steps
Step 0: Monitor Work Item Age in your Daily Scrum
The one big takeaway from this book is that if you want to introduce
Flow Metrics to your Scrum Teams, the place to start is with
Work Item Age. And the time to look at Age is during your Daily
Scrum. If you get Aging right, then everything else will fall into
place. However, it is a bit disingenuous to say “start with Age”
because chances are, if you have been doing a more traditional
implementation of Scrum, you don’t even have the basics in
place to monitor Aging. So what follows are some steps you may
need to do first in order to layer in Aging and other Flow Metrics
into your Scrum-based process.
Define start and finish points
or down based on what your data is telling you. The time to have
that conversation? You guessed it: the Sprint Retrospective.
Conclusion
There are many ways to get started with Flow Metrics in Scrum.
This chapter provides a blueprint of the approach that we have
seen work. The majority of the work to get started is around
understanding your current activities and defining your workflow
and policies. Once we have a common understanding of these, we
can start watching the reasons why items age in our system and
make the appropriate adjustments.
It’s up to you from this point forward. Use your data, inspect and
adapt your workflow as needed, build great things. Good luck!
Bibliography
Bertsimas, D., D. Nakazato. The distributional Little’s Law and its
applications. Operations Research. 43(2) 298–310, 1995.
Brooks, F. (1995). Mythical Man-Month, The: Essays on Software
Engineering, Anniversary Edition (Anniversary ed.). Addison-
Wesley Professional.
Brumelle, S. On the relation between customer and time averages
in queues. J. Appl. Probab. 8 508–520, 1971.
Coleman, John and Vacanti, Daniel S. “The Kanban Guide”
https://kanbanguides.org, 2020.
Constable, G. (2021, April 8). The Truth Curve and the Build Curve.
https://giffconstable.com/2021/04/the-truth-curve-and-the-build-curve/
Deming, W. Edwards. The New Economics. 2nd Ed. The MIT Press,
1994.
Deming, W. Edwards. Out of the Crisis. The MIT Press, 2000.
Glynn, P. W., W. Whitt. Extensions of the queuing relations L = λW
and H = λG. Operations Research. 37(4) 634–644, 1989.
Goldratt, Eliyahu M., and Jeff Cox. The Goal.
2nd Rev. Ed. North River Press, 1992.
Gothelf, J., & Seiden, J. (2021). Lean UX: Designing Great Products
with Agile Teams (3rd ed.). O’Reilly Media.
Heyman, D. P., S. Stidham Jr. The relation between customer and
time averages in queues. Oper. Res. 28(4) 983–994, 1980.