Marco Craveiro: Nerd Food: On Product Backlog

Nerd Food: On Product Backlog

Would it be good to have a better bug-tracking setup? Yes. But I think it takes man-power, and it would take something *fundamentally* better than bugzilla. -- Linus

Many developers in large companies tend to be exposed to a strange variation of agile which I like to call "Enterprise Grade Agile", but I've also heard it called "Fragile" and, most aptly, "Cargo-Cult Agile". However you decide to name the phenomenon, the gist of it is that these setups contain nearly all of the ceremony of agile - including stand-ups, sprint planning, retrospectives and so on - but none of its spirit. Tweets such as this are great at capturing the essence of the problem:

Top tip: if you need to bring a notepad to the daily stand up to tell us what you did yesterday that's too many details
— Fran Buontempo (@fbuontempo) January 12, 2016

Once you start having that nagging feeling of doing things "because you are told to", and once your stand-ups become more of a status report to the "project manager" and/or "delivery manager" - the existence of which, in itself, is rather worrying - your Cargo Cult Agile alarm bells should start ringing. As I see it, agile is a toolbox with a number of tools, and they only start to add value once you've adapted them to your personal circumstances. The fitness function that determines if a tool should be used is how much value it adds to all (or at least most) of its users. If it does not, the tool must be further adapted or removed altogether. And, crucially, you learn about agile tools by using them and by reflecting on the lessons learned. There is no other way.

This post is one such exercise and the tool I'd like to reflect on is the Product Backlog. Now, before you read through the whole rant, it's probably worth saying that this post takes a slightly narrow and somewhat "advanced" view of agile, with a target audience of those already using it. If you require a more introductory approach, you are probably better off looking at other online resources such as How to learn Scrum in 10 minutes and clean your house in the process. Having said that, I'll try to define terms as best I can to make sure we are all on the same page.

Working Definition

Once your company has grokked the basics of agile and starts to move away from those lengthy specification documents - those that no one reads properly until implementation and those that never specified anything the customer wanted, but everything we thought the customer wanted and then some - you will start to use the product backlog in anger. And that's when you will realise that it is not quite as simple as memorising text books.

So what do the "text books" say? Let's take a fairly typical definition - this one from Scrum:

The agile product backlog in Scrum is a prioritized features list, containing short descriptions of all functionality desired in the product. When applying Scrum, it's not necessary to start a project with a lengthy, upfront effort to document all requirements. Typically, a Scrum team and its product owner begin by writing down everything they can think of for agile backlog prioritization. This agile product backlog is almost always more than enough for a first sprint. The Scrum product backlog is then allowed to grow and change as more is learned about the product and its customers.¹

This is a good working definition, which will suffice for the purposes of this post. It is deceptively simple. However, as always, one must remember Yogi Berra: "In theory, there is no difference between theory and practice. But in practice, there is."

Potemkin Product Backlogs

Many teams finish reading one such definition, find it amazingly inspiring, install the "agile plug-in" on their bug-tracking software of choice and then furiously start typing in those tickets. But if you look closely, you'd be hard-pressed to find any difference between the bug tickets of old versus the "stories" in the new and improved "product backlog" that apparently you are now using.

This is a classic management disconnect, whereby a renaming exercise is applied and suddenly, Potemkin village-style, we are now in with the kool kids and our company suddenly becomes a modern and desirable place to work. But much like Potemkin villages were not designed for real people to live in, so "Potemkin Product Backlogs" are not designed to help you manage the lifecycle of a real product; they are there to give you the appearance of doing said management, for the purposes of reporting to the higher echelons and so that you can tell stakeholders that "their story has been added to the product backlog for prioritisation".

Alas, very soon you will find that the bulk of the "user stories" are nothing but glorified one-liners whose exact meaning no one seems to recall, and those few elaborately detailed tickets end up rotting because they keep being deprioritised and now describe a world long gone. Soon enough you will find that your sprint planning meetings will cover less and less of the product backlog - after all, who is able to prioritise this mess? Some stories don't even make any sense! The final act is when all stories worked on are stories raised directly on the sprint backlog, and the product backlog is nothing but the dumping ground for the stories that didn't make it on a given sprint. At this stage, the product backlog is in such a terrible mess that no one looks at it, other than for the occasional historic search for valuable details on how a bug was fixed. Eventually the product backlog is zeroed - maybe a dozen or so of the most recent stories make it through the cull - and the entire process begins anew. Alas, enlightenment is never achieved, so you are condemned to repeat this cycle for all eternity.

As expected, the Potemkin Product Backlog adds very little value - in fact it can be argued that it detracts value - but it must be kept because "agile requires a product backlog".

Bug-Trackers: Lessons From History

In order to understand the difficulties with a product backlog, we turn next to their logical predecessors: bug-tracking systems such as Bugzilla or Jira. This post starts with a quote from the kernel's Benevolent Dictator that illustrates the problem with these. Linus has long taken the approach that there is no need for a bug-tracker in kernel development, although he does not object if someone wants to use one for a subsystem. You may think this is a very primitive approach but in some ways it is also a very modern approach, very much in line with agile; if you have a bug-tracking system which is taking time away from developers without providing any value, you should remove the bug-tracking system. In kernel development, there simply is no space for ceremony - or, for that matter, for anything which slows things down².

All of which raises the question: what makes bug-tracking systems so useless? From experience, there are a few factors:

they are a "fire and forget" capture system. Most users only care about entering new data, rather than worrying about the lifecycle of a ticket. Very few places have some kind of "ticket quality control" which ensures that the content of the ticket is vaguely sensible, and those who do suffer from another problem:
they require dedicated teams. By this I don't just mean running the bug-tracking software - which you will most likely have to do in a proprietary shop; I also mean the entire notion of QA and Testing as separate from development, with reams of people dedicated to setting "environments" up (and keeping them up!), organising database restores and other such activities that are incompatible with current best practices of software development.
they are temples of ceremony: a glance at the myriad of fields you need to fill in - and the rules and permutations required to get them exactly right - should be sufficient to put off even the most ardent believer in process. Most developers end up memorising some safe incantation that allows them to get on with life, without understanding the majority of the data they are entering.
as the underlying product ages, you will be faced with the sad graph of software death. The main problem is that resources get taken away from systems as they get older, a phenomenon that manifests itself as a growth in the delta between the number of open tickets and the number of closed tickets. This is actually a really useful metric but one that is often ignored³.
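The open versus closed delta from the last point is cheap to compute. What follows is a minimal sketch in Python, assuming your tracker can export how many tickets were opened and closed in each period; the function name and the sample numbers are made up for illustration.

```python
from itertools import accumulate


def open_ticket_backlog(opened_per_period, closed_per_period):
    """Return the running count of unresolved tickets after each period.

    Both arguments are sequences of counts, one entry per period
    (for example, per sprint or per month).
    """
    if len(opened_per_period) != len(closed_per_period):
        raise ValueError("both series must cover the same periods")
    # Net change per period: tickets that arrived minus tickets that were closed.
    deltas = [o - c for o, c in zip(opened_per_period, closed_per_period)]
    # The running sum of those net changes is the size of the open backlog.
    return list(accumulate(deltas))


# Hypothetical numbers: tickets keep arriving while fewer and fewer get closed.
opened = [10, 12, 11, 13]
closed = [9, 8, 6, 4]
backlog = open_ticket_backlog(opened, closed)
print(backlog)
```

A series that keeps widening from one period to the next is the "sad graph" in numeric form: work is arriving faster than the remaining resources can close it.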

And what of the newest iterations on this venerable concept such as GitHub Issues? Well, clearly they solve a number of the problems above - such as lowering the complexity and cost barriers - and certainly they do serve a very useful purpose: they allow the efficient management of user interactions. Every time I create an issue - such as this one - it never ceases to amaze me how easily the information flows within GitHub projects; one can initiate comms with the author(s) or other users with zero setup - something that previously required mailing list membership, opening an account on a bug-tracker and so forth. We now take all of this for granted, of course, but it is important to bear in mind that many open source projects would probably not even have any form of user interaction support, were it not for GitHub. After all, most of them are a one-person shop with very little disposable time, and it makes no sense to spend part of that time maintaining infrastructure for the odd person or two who may drop by to chat.

However, for all of its glory, it is also important to bear in mind that GitHub Issues is not a product backlog solution. What I mean by this is that the product backlog must be owned by the team that owns the product and, as we shall see, it must be carefully groomed if it is to be continually useful. This is at loggerheads with allowing free flow of information from users. Your Issues will eventually be filled up with user requests and questions which you may not want to address, or general discussions which may or may not have a story behind them. They are simply different tools for different jobs, albeit with an overlap in functionality.

So, history tells us what does not work. But is the product backlog even worth all this hassle?

Voyaging Through Strange Seas of Thought

One of the great things about agile is how much it reflects on itself; a strange loop of sorts. Presentations such as Kevlin Henney's The Architecture of Uncertainty are part of this continual process of discovery and understanding, and provide great insights about the fundamental nature of the development process. The product backlog plays - or should play - a crucial role exactly because of this uncertain nature of software development. We can explain this by way of a device.

Imagine that you start off by admitting that you know very little about what it is that you are intending to do and that the problem domain you are about to explore is vast and complex. In this scenario, the product backlog is the sum total of the knowledge gained whilst exploring this space that has not yet been transformed into source code. Think of it like the explorer's maps in the fifteen-hundreds. In those days, "users" knew that much of it was incorrect and a great part was sketchy and ill-defined, but it was all you had. Given that the odds of success were stacked against you, you'd hold that map pretty tightly while the storms were raging about you. Those that made it back would provide corrections and amendments and, over time, the maps eventually converged with the real geography.

The product backlog does something similar, but of course, the space you are exploring does not have a fixed geometry or topography and your knowledge of the problem domain can actively change the domain itself too - an unavoidable consequence of dealing with pure thought stuff. But the general principle applies. Thus, in the same way a code base is precious because it embodies the sum total knowledge of a domain - heck, in many ways it is the sum total knowledge of a domain! - so the product backlog is precious because it captures all the known knowledge of these yet-to-be-explored areas. In this light, you can understand statements such as this:

When your product backlog is empty, your product is dead - @KevlinHenney #agileotb
— Marc Johnson (@marcjohnson) September 4, 2014

So, if the backlog is this important, how should one manage it?

Works For Me, Guv!

Up to this point - whilst we were delving into the problem space - we have been dealing with a fairly general argument, likely applicable to many. Now, as we enter the solution space, I'm afraid I will have to move from the general to the particular and talk only about the specific circumstances of my one-man-project Dogen. You can find Dogen's product backlog here.

This may sound like a bit of a cop out, you may say, and not without reason: how on earth are you supposed to extrapolate conclusions from a one-person open source project to a team of N working on a commercial product? However, it is also important to take into account what I said at the start: agile is what you make of it. I personally think of it as a) the smallest amount of processes required to make your development process work smoothly and b) the continual improvement of those processes. Thus, there are no one-size-fits-all solutions; all one can do is to look at others for ideas. So, let's look at my findings⁴.

The first and most important thing I did to help me manage my product backlog was to use a simple text file in Org Mode notation. Clearly, this is not a setup that is workable for a development team much larger than a set of one, or one that doesn't use Emacs (or Vim). But for my particular circumstances it has worked wonders:

the product backlog is close to the code, so wherever you go, you take it with you. This means you can always search the product backlog and - most importantly - add to it wherever you are and whenever an idea happens to come by. I use this flexibility frequently.
the Org Mode interface makes it really easy to move stories up and down (order is taken to mean priority here) and to create "buckets" of stories according to whatever categorisation you decide to use, up to any level of nesting. At some point you end up converging to a reasonable level of nesting, of course. It is surprising how one can manage very large amounts of stories thanks to this flexible tree structure.
it's trivial to move stories in and out of a sprint, keeping track of all changes to a story - they are just text that can be copied and pasted and committed.
Org Mode provides a very capable tagging system. I first started by overusing these, but when tagging got too fine-grained it became unmaintainable. Now we use too few - just epic and story - so this will have to change again in the near future. For example, it should be trivial to add tags for different components in the system or to mark stories as bugs or features, etc. Searching then allows you to see a subset of the stories that match those labels.
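To make the list above a little more concrete, here is a rough sketch of what such a backlog might look like and how a tag search over it could be scripted outside Emacs. The backlog text, the story titles and the function names are all invented for illustration; they are not taken from Dogen's actual backlog. The sketch only looks at tags written on the headline itself and ignores Org Mode's tag inheritance.

```python
import re

# Hypothetical backlog in Org Mode notation; headline order stands for priority.
BACKLOG = """\
* Product Backlog
** Code generation                                                  :epic:
*** Add support for enumerations                                   :story:
*** Generate forward declarations                                  :story:
** Build                                                            :epic:
*** Speed up the nightly build                                     :story:
"""

# Leading stars give the nesting level; tags, if any, trail the title as :a:b:.
HEADLINE = re.compile(r"^(\*+)\s+(.*?)(?:\s+:([\w@#%:]+):)?\s*$")


def parse_headlines(text):
    """Yield (level, title, tags) for every Org headline in text."""
    for line in text.splitlines():
        match = HEADLINE.match(line)
        if match:
            stars, title, tags = match.groups()
            yield len(stars), title, set(tags.split(":")) if tags else set()


def titles_with_tag(text, tag):
    """Return the titles of headlines carrying tag, in priority order."""
    return [title for _, title, tags in parse_headlines(text) if tag in tags]


print(titles_with_tag(BACKLOG, "story"))
```

Because the file is plain text, the same idea extends to whatever extra tags you settle on, such as one per component or one for bugs versus features.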

A second decision which has proven to be a very good one has been to groom the product backlog very often. And by this I don't just mean a cursory look, but a deep inspection of all stories, fixing them where required. Again, the choice of format has proved very helpful:

it is easy to mark all stories as "non-reviewed" or some other suitable tag in Org Mode, and then unmark them as one finishes the groom - thereby ensuring all stories get some attention. As the product backlog becomes larger, a full groom could take multiple sprints, but this is not an issue once you understand its value and the cost of having it rot.
because the product backlog is with the code, any downtime can be used for grooming; those idle weekends or that long wait at the airport are perfect candidates to get a few stories looked at. Time spent waiting for the build is also a good candidate.
you get an HTML representation of the Org Mode file for free in GitHub, meaning you can read your backlog from your phone. And with the new editing functionality, you can edit stories too.
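The mark-then-unmark grooming pass from the first point can be sketched in a few lines. This is only an illustration under assumed names: the `unreviewed` marker, the function names and the sample stories are hypothetical, and stories are modelled as a plain mapping from title to a set of tags rather than as a real Org Mode file.

```python
def start_groom(stories, marker="unreviewed"):
    """Tag every story as pending review at the start of a full groom."""
    for tags in stories.values():
        tags.add(marker)


def mark_reviewed(stories, title, marker="unreviewed"):
    """Remove the marker once a story has been inspected and fixed."""
    stories[title].discard(marker)


def groom_progress(stories, marker="unreviewed"):
    """Return (reviewed, total) so a groom can span several sprints."""
    pending = sum(1 for tags in stories.values() if marker in tags)
    return len(stories) - pending, len(stories)


# Hypothetical stories mapped to their tag sets.
stories = {
    "Add support for enumerations": {"story"},
    "Generate forward declarations": {"story"},
    "Speed up the nightly build": {"story"},
}
start_groom(stories)
mark_reviewed(stories, "Speed up the nightly build")
print(groom_progress(stories))
```

Whatever is still marked at the end of a sprint is simply where the next grooming session picks up.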

Thirdly, I decided to take a "multi-pass" approach at managing the story lifecycle. These are some of the key aspects of this lifecycle management:

stories can only be captured if they are aligned with the vision. This filter saves me from adding all sorts of ideas which are just too "out of left field" to be of practical use, but keeps those that may sound crazy but are aligned with the vision.
stories can only be captured if there is no "prior art". I always perform a number of searches in the backlog to look for anything which covers similar ground. If found, I append to that.
new stories tend to start with very little content - just the minimum required to allow resetting state back to the idea I was trying to capture. Due to this, very little gets lost. At this point, we have a "proto-story".
as time progresses, I end up having more ideas on this space, and I update the story with those ideas - mainly bullet points with one liners and links.
at some point the story begins to mature; there is enough on it that we can convert the "proto-story" to a full-blown story. After a number of grooms, the story becomes fully formed and is then a candidate to be moved to a sprint backlog for implementation. It may stay in this state ad infinitum, with periodic updates just to make sure it does not rot.
a candidate story can still get refined: trimmed in scope, re-targeted, or even cancelled because it no longer fits with the current architecture or even the vision. Cancelled stories are important because we may come back to them - it's just very unlikely that we do.
every sprint has a "sprint mission"⁵. When we start to move stories into the sprint backlog, we look for those which resonate with the sprint mission. Not all of them are fully formed, and the work on the sprint can entail the analysis required to create a full-blown story. But many will be implementable directly off of the product backlog.
sometimes I end up finding related threads in multiple stories and decide to merge them. Merging of related stories is done by simply copying and pasting them into a single story; over time, with the multiple passes done in the grooms, we end up again with a single consistent story.
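One way to picture the lifecycle above is as a small state machine. The sketch below is my reading of the list rather than anything that exists in Dogen: the state names, the allowed transitions and the title-based "prior art" check are all assumptions, and the vision filter is left out because it is a judgement call that code cannot make.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    PROTO = "proto-story"
    STORY = "story"
    CANDIDATE = "candidate"
    IN_SPRINT = "in sprint"
    CANCELLED = "cancelled"


# Allowed moves between states; an assumption drawn from the list above.
TRANSITIONS = {
    State.PROTO: {State.STORY, State.IN_SPRINT, State.CANCELLED},
    State.STORY: {State.CANDIDATE, State.IN_SPRINT, State.CANCELLED},
    State.CANDIDATE: {State.IN_SPRINT, State.CANCELLED},
    State.IN_SPRINT: set(),
    State.CANCELLED: {State.PROTO},  # unlikely, but we may come back to it
}


@dataclass(eq=False)  # stories are compared by identity, not by content
class Story:
    title: str
    notes: list = field(default_factory=list)
    state: State = State.PROTO

    def move_to(self, new_state):
        """Change state, refusing moves the lifecycle does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot move from {self.state.value} to {new_state.value}"
            )
        self.state = new_state


def capture(backlog, title, note):
    """Append to a story covering the same ground, else add a proto-story."""
    for story in backlog:
        if story.title.lower() == title.lower():  # crude "prior art" search
            story.notes.append(note)
            return story
    story = Story(title, [note])
    backlog.append(story)
    return story


def merge(backlog, keep, absorb):
    """Fold one story's notes into another; later grooms tidy the result."""
    keep.notes.extend(absorb.notes)
    backlog.remove(absorb)


backlog = []
first = capture(backlog, "Support enumerations", "idea: map to enum class")
again = capture(backlog, "support enumerations", "link to related discussion")
first.move_to(State.STORY)
first.move_to(State.CANDIDATE)
print(again is first, len(backlog), first.state.value)
```

A real prior-art search would of course look at story content and not just titles; the exact match here only keeps the example short.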

What all of this means is that a story can evolve over time in the product backlog, only to become the exact thing you need at a given sprint; at that point you benefit from the knowledge and insight gained over that long period of time. Some stories in Dogen's backlog have been there for years, and when I finally get to them, I find them extremely useful. Remember: they are a map to the unknown space you are exploring.

With all of this machinery in place, we've ended up with a very useful product backlog for Dogen - one that certainly adds a lot of value. Don't get me wrong, the cost of maintenance is high and I'd rather be coding instead of maintaining the product backlog, especially given the limited resources. But I keep it because I can see on a daily basis how much it improves the overall quality of the development process. It is a price I find worth paying, given what I get in return.

Final Thoughts

This post was an attempt to summarise some of the thoughts I've been having on the space of product backlogs. One of its main objectives was to try to convey the importance of this tool, and to provide ideas on how you can improve the management of your own product backlog by discussing the approach I have taken with Dogen.

If you have any suggestions or want to share your own tips on how to manage your product backlog, please reach me in the comments section - there is always space for improvement.

Footnotes:

¹ Source: Scrum Product Backlog, Mountain Goat Software.

² A topic which I covered some time ago here: On Evolutionary Methodology. It is also interesting to see how the kernel processes are organised for speed: How 4.4's patches got to the mainline.

³ Another topic which I also covered here some time ago: On Maintenance.

⁴ I am self-plagiarising a little bit here and rehashing some of the arguments I've used before in Lessons in Incremental Coding, mainly from section DVCS to the Core.

⁵ See the current sprint backlog for an example.

Created: 2016-01-17 Sun 23:55

Emacs 24.5.1 (Org mode 8.2.10)

Marco Craveiro

Sunday, January 17, 2016