Factor 1: Business cost
Cost is always critical and often the number 1 factor. Naturally, you want to fix the bugs that are putting a dent in your revenues as quickly as possible. For example, if a bug only manifests on a few orders on an eCommerce site, but they are the biggest orders you get, then it should get high priority.
Factor 2: Reach or Frequency
The second factor to consider is the number of people affected by the bug. If your support team is flooded with tickets from many different customers who encounter the same bug, you’d better jump on that one. Conversely, if only one person reported it, you might save that one for the next COVID-19 lockdown. Another way to look at this is how often the bug is encountered. If it’s several times a day, it goes up the list, but if it was only once, two years ago, like I said, that’s your lockdown project. Note, however, that this one could conflict with the cost factor. You could have a bug that only affects a few customers, but if those are your biggest customers generating 75% of your revenue, then lockdown or not, you’ll want to squash those bugs as quickly as possible.
Factor 3: Process
Factor 4: Alternatives
Depending on whom the bug affects, you might want to consider that there is more than one way to climb a tree. For example, you might circumvent that bug in the office coffee supply software by directing your employees to a mobile app until the coffee is flowing freely again. Your customers, however, may not be so accommodating if the bug is affecting them.
Factor 5: Age
Unlike fine wine, bugs do not age well. The longer a bug exists in your system, the higher the cost of fixing it, for several reasons. For one, the longer a bug lurks in your system, the more users it affects, and some of those users will seek better-behaved software elsewhere. Then, as new versions get pushed to production, the bug could manifest in new and different ways, making it more difficult to perform a root cause analysis. Finally, as new modules get added to your application, they may actually depend on the software with the faulty behavior. Then, when you fix it, you may break those modules.
Factor 6: Reproducibility
Reproducing a bug is usually the first step towards fixing it because, once you can consistently reproduce a bug, it’s much easier to understand it fully and, therefore, to fix it. In any software development organization, developer time is always at a premium. The longer a developer must spend reproducing a bug in order to fix it, the costlier it is to fix that bug. If you have two different bugs that have a similar impact on your business, it makes sense to fix the “easy” one (i.e., the one you can reproduce) first. But here’s the catch. You can’t ignore the difficult ones forever, even if they are low impact. As you leave these bugs low on your priority list, the age factor will eventually come into play, increasing their impact and raising their priority. In a way, these six factors are self-adjusting.
These six factors will take you a long way toward prioritizing bugs for fixing, and you can fine-tune this method to some degree by scoring each bug on each factor. For example, a bug that doesn’t have a high immediate impact on revenue might get 1 penalty point, while a bug that does might get 10 penalty points. You can also weight the factors differently. For example, the Business Cost factor may carry a multiplier of 5, while the Process factor carries a multiplier of 2. By assigning penalty points and multipliers to each factor, you can go some way toward automating bug prioritization. This is one way to do it. The problem is that this approach is fairly generic, and every business may have specific criteria with which it prioritizes bugs.
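To make the penalty-point idea concrete, here is a minimal sketch in Python. Only the Business Cost multiplier of 5 and the Process multiplier of 2 come from the example above. The other multipliers, the 1 to 10 scale for every factor, and the names `Bug`, `priority_score`, and `prioritize` are assumptions made up for illustration, so tune them to your own business.

```python
from dataclasses import dataclass, field

# Multiplier per factor. Only business_cost=5 and process=2 come from the
# article's example; the remaining values are illustrative assumptions.
MULTIPLIERS = {
    "business_cost": 5,
    "reach": 3,
    "process": 2,
    "alternatives": 1,
    "age": 2,
    "reproducibility": 1,
}


@dataclass
class Bug:
    title: str
    # Penalty points per factor: 1 (low impact) up to 10 (high impact).
    penalties: dict = field(default_factory=dict)


def priority_score(bug: Bug) -> int:
    """Sum of penalty points times multiplier over all six factors.

    A factor with no penalty assigned counts as 1, the lowest impact.
    """
    return sum(
        multiplier * bug.penalties.get(factor, 1)
        for factor, multiplier in MULTIPLIERS.items()
    )


def prioritize(bugs: list) -> list:
    """Return the bugs ordered from highest to lowest score."""
    return sorted(bugs, key=priority_score, reverse=True)


if __name__ == "__main__":
    backlog = [
        Bug("Typo in footer"),
        Bug("Checkout fails on large orders", {"business_cost": 10}),
        Bug("Nightly export stalls", {"process": 10}),
    ]
    for bug in prioritize(backlog):
        print(priority_score(bug), bug.title)
```

Because age is just another factor in the sum, raising a bug's age penalty over time gradually lifts neglected bugs up the list, which is the self-adjusting behavior described earlier.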
Automating prioritization with contextual data
In a previous post, we showed how Ozcode Production Debugger’s autonomous exception capture removes the reproducibility factor. By capturing the full code execution flow of an error, including calling arguments, local variables, relevant log entries, HTTP and database requests, Ozcode presents all the data a developer needs for a root cause analysis, so there’s no need to reproduce the bug.
With reproducibility off the table, prioritizing low-impact bugs before they age and become high-impact becomes a valid strategy for your team.
But it gets even better than that. Ozcode’s exception capture includes additional contextual data to help developers debug errors. This includes things like machine name, agent version, and service name so you can understand which business flow was affected, and perhaps even assign the bug to the developer who worked on that business flow.
Ozcode takes contextual data a step further and lets you customize which data is collected so you can include that data when prioritizing bugs for remediation. Let’s consider a few examples.
- When a backend handles an HTTP request, you can augment an error with a user ID. This can give you a better indication of a bug’s reach to see exactly how many (and which) users are impacted.
- When an ordering API processes requests, you can attach a dollar value to each request and, therefore, know the exact value of failed requests.
- Any global enterprise could attach a geo-tag to each application running on its servers worldwide to identify any location-specific problems in the company’s network. Errors detected around the company’s corporate site or its top revenue site would get priority over errors in remote, low-traffic locations.
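As a rough illustration of how such contextual data could feed prioritization, the Python sketch below aggregates error records by bug. This is not Ozcode's API. The record layout (`bug_id`, `context`, `user_id`, `order_value`) and the function names are hypothetical, and the sketch assumes you can export captured errors as plain dictionaries.

```python
from collections import defaultdict


def summarize(errors):
    """Group error records by bug and compute frequency, reach, and cost.

    Each record is assumed (hypothetically) to look like:
    {"bug_id": "A", "context": {"user_id": "u1", "order_value": 20.0}}
    """
    summary = defaultdict(
        lambda: {"count": 0, "users": set(), "failed_value": 0.0}
    )
    for err in errors:
        ctx = err.get("context", {})
        entry = summary[err["bug_id"]]
        entry["count"] += 1                        # frequency
        if "user_id" in ctx:
            entry["users"].add(ctx["user_id"])     # reach (distinct users)
        entry["failed_value"] += ctx.get("order_value", 0.0)  # business cost
    return dict(summary)


def rank_by_failed_value(summary):
    """Order bug IDs by value of failed requests, then by users affected."""
    return sorted(
        summary,
        key=lambda bug_id: (
            summary[bug_id]["failed_value"],
            len(summary[bug_id]["users"]),
        ),
        reverse=True,
    )


if __name__ == "__main__":
    captured = [
        {"bug_id": "A", "context": {"user_id": "u1", "order_value": 20.0}},
        {"bug_id": "A", "context": {"user_id": "u2", "order_value": 30.0}},
        {"bug_id": "A", "context": {"user_id": "u1", "order_value": 10.0}},
        {"bug_id": "B", "context": {"user_id": "u9", "order_value": 5000.0}},
    ]
    stats = summarize(captured)
    for bug_id in rank_by_failed_value(stats):
        s = stats[bug_id]
        print(bug_id, s["count"], len(s["users"]), s["failed_value"])
```

In the sample data, bug B occurs once and affects a single user, yet it ranks first because the failed order is worth far more. That is the same conflict between reach and business cost described under Factor 2.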
By enabling you to assign rich contextual data to each error in your system, Ozcode Production Debugger provides a way to automate prioritization based on the business cost, frequency, reach, and process factors. This kind of flexibility gives organizations the ability to customize their bug prioritization criteria as a mix of both generic factors and other factors specific to the business. The more context you have around an error, the better you can prioritize it and the faster you can resolve it. By collecting rich contextual data with Ozcode Production Debugger, you can prioritize issues based on real-time metrics that are linked to each line of your code. That’s not only data-driven debugging; it’s business-specific data-driven debugging.
Stay tuned for a follow-up post about Ozcode’s NuGet API for Contextual Data that explains how you can customize the data attached to each exception capture in your system.