Hardly any people in Tech like when there's a lot of tech debt. And most of us would like when there's not too much of it. But how do we understand how much exactly we have of it? Where exactly does it sit? Which part of it is actually the most annoying? What would be the benefit for us if we spend time getting rid of it? When it comes to planning how you tackle your tech debt, all these questions deserve answers. Especially when we're asked about the ROI on our efforts to eliminate some annoying legacy stuff and build a new shiny module or service. Also, when we work on tech debt, we do want to tackle the most impactful parts of it first, don't we? This talk is about all of that: how we measure our tech debt, how we interpret the results of these measurements so that they give us the answers to the right questions, and how we guide our decision making with those answers.
A Quick and Complete Guide to Measuring Your Tech Debt and Using the Results
AI Generated Video Summary
This Talk discusses the measurement and interpretation of tech debt. Tech debt is a tool to temporarily speed up development but can have negative consequences if not managed properly. Various tech debt metrics, including heuristic metrics and second-tier metrics, can help identify and manage tech debt. Tech debt interest is crucial for measuring the impact of tech debt and allows for prioritization. It is important to collect and analyze tech debt metrics to ensure software and team health.
Hi, everyone. My name is Anton, and thank you for having me at TechLeadConf this year. In the next 20, maybe 25 minutes, I'm gonna be talking about measuring tech debt and interpreting the results. I hope this talk will be useful and interesting for you. I lead engineering organizations and mentor other engineering leaders. Feel free to contact me for mentorship or coaching. Let's dive right in.
Hi, everyone. My name is Anton, and thank you for having me at TechLeadConf this year. And in the next 20, maybe 25 minutes, I'm gonna be talking about measuring tech debt and interpreting the results. I hope this talk will be useful and interesting for you.
First, a couple of details about me. I lead engineering organizations. I am a director of engineering at Westing in Germany at the moment. I also mentor and coach other engineering leaders, like engineering managers and staff and principal engineers. So if you're looking for a mentor or a coach, feel free to contact me, I'm always open. Outside of work, I'm a father and a big fan of mountain skiing and hiking, amongst other things. Down below you can see my Twitter handle and my contact info: my email address and the link to connect with me on LinkedIn. And with that, let's dive right in.
2. Understanding Tech Debt
Tech debt is not inherently bad but depends on various factors. It is a tool to temporarily speed up development by cutting corners, much like taking a loan. To determine if we have tech debt, we can analyze our software and answer questions such as ease of analysis, modification, and safety. There are two types of tech debt: maintenance tech debt, which slows down changes due to imperfect design, and continuous tech debt, which requires time to keep the application operational. Continuous tech debt becomes bad when deployed to production, leading to bugs, downtime, security breaches, and lost revenue. Therefore, it is important to manage tech debt to avoid negative consequences.
Let's start with a shocker. "What if I told you," says our dear friend Morpheus, "that not all your tech debt actually needs fixing?" I fully support this statement, or this question rather. And to understand how it can be that not all tech debt needs fixing, and how that contradicts the relatively popular belief that tech debt is generally the root of all evil, let's look at a few things.
So, starting with what tech debt actually is, if not the root of all evil. The term was coined by this guy, Ward Cunningham, in 1992. He was much younger at that time, because this is a screenshot from his explainer video on YouTube that he posted in 2009. The video is called Tech Debt Metaphor, and he talks about it in depth there. Here are a couple of quotes, which we're not gonna read. We'll just skip over them. So, according to Ward Cunningham, the author of the term, tech debt is not the root of all evil. What it is, is a tool to temporarily speed up development: we choose to cut corners to speed up now at the expense of slowing down later. Much like taking a loan, we can afford something earlier than we have fully earned it, and for that we have to pay interest. So since it's a tool, is tech debt inherently bad? Well, let's ask our dear friend Morpheus, and he will tell us that the correct answer is "it depends," as it is for many questions in the software engineering domain. But how does it depend? That's the interesting part, right? What does it depend on, and how can we see that? Well, first, let's look at how we tell if we have tech debt in the first place. We can do that by looking at our piece of software and answering a number of questions. First, is this piece of software easy to analyze and understand? Second, is it easy to modify? Third, is it safe to modify? And finally, the fourth question: are there any needed technical requirements, for example scalability, stability, or security requirements, that aren't implemented? If the answer to any of the first three questions is no, or the answer to the fourth is yes, we do have tech debt in this piece of software.
Additionally, I like to distinguish between two types of tech debt. The first is maintenance tech debt. This is the tech debt related to the first three questions, which is basically the part of tech debt that slows down our changes, be it features or other types of changes in our code base. We implement them slower because of the tech debt, because the design is not perfect. The second type is continuous tech debt, which means that, because of some tech debt in our application, we need to spend some time on keeping the application operational. This tech debt actually shrinks the bucket of time that we can spend on introducing changes into our code base. So this is how these two types of tech debt are different, and this is how they differently define the answer to the question: when does tech debt actually become bad? Because not all of it is inherently bad. Well, continuous tech debt is immediately bad as soon as we deploy it to production. Any reliability, scalability, or security shortcuts produce first-order consequences like bugs, downtime, security breaches, and, god forbid, stolen data. All of that in turn produces ad hoc effort and context switching, which is, as we know, another popular productivity burner. Sometimes it means lost revenue, and sometimes things that are even worse than that, like long-term reputational damage and so on and so forth. So we don't want that.
3. Understanding Tech Debt Metrics
Continuous tech debt is immediately bad, while maintenance tech debt becomes bad when changes are introduced in the affected area. Tech debt metrics, including heuristic metrics, second tier metrics, and tech debt interest, help identify and manage tech debt. Heuristic tech debt metrics, such as cyclomatic code complexity and code duplication, are automated and provided by existing tools. These metrics can be divided into maintenance tech debt and continuous tech debt. However, converting these metrics into the actual work required to address the tech debt can be challenging.
And continuous tech debt is immediately bad and it is posing all of these risks on our product. Now, maintenance tech debt only becomes bad when you need to introduce changes in the area where you have it, which makes sense. Because if something is working, you don't care about how bad the design of it is as long as it's working and you don't need to introduce any new requirements there. For instance, we're in the microservices world, right? So, it can be a simple microservice, poorly designed, made 10 years ago, that sits there and just works and has worked ever since because there were no further requirements to implement and we did not need to care about the tech debt that sits in there.
What that all means is that the main questions about tech debt to ask are actually, how do we tell which tech debt needs fixing, at least right now and also how do we tell when the tech debt that needs fixing is getting out of hand so that our backlog of the tech debt that actually needs fixing short term is getting too big to fix short term? And the answer to that is in the tech debt metrics or tech debt related metrics, which I like to divide in three main buckets, namely heuristic metrics, second tier metrics, and the bucket that only contains one, tech debt interest, and you will see why it's so special. Spoiler alert, it is special. So let's start one by one.
Heuristic tech debt metrics. When we hear the word heuristic, it's usually about something automated, right? And these metrics are no exception: they are automated, and they are usually provided by tooling that already exists. Most of this tooling measures things like cyclomatic code complexity, code duplication, and code smells, a term first popularized by Kent Beck in the 1990s and then hugely promoted in the book Refactoring, which many of you may know, written by Martin Fowler and published in 1999. Then there would be something like Maintainability Index, which can have a different name, but is generally an aggregation of the above metrics and potentially something else. Then there would be Tech Debt Ratio, which is the ratio of tech debt remediation cost to development cost. Unfortunately, despite having cost in the name and allegedly being about direct business impact in money, this cost is too synthetic and therefore too inaccurate. Usually it's determined by the number of lines of code in your codebase multiplied by some synthetic quotient, the cost of developing a line of code, which as you can imagine varies from line to line, so it's rarely a good metric for the actual effort. Then there would be two more metrics: statically or heuristically detectable security issues, and heuristically detectable, potentially missed edge cases, which are especially important in loosely typed languages where we don't have compilers to detect those cases. And finally, these metrics can also be divided into the two buckets that we talked about previously, maintenance tech debt and continuous tech debt. It's just that continuous tech debt here is not something that takes us by surprise, but rather something that we can detect already while analyzing the code and potentially fix, which is generally a good idea.
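To make the tech debt ratio described above a bit more concrete, here is a minimal Python sketch. The function name, the hour-based units, and all the numbers are assumptions made up for illustration; real tools use their own formulas and defaults. The point it shows is how strongly the result depends on the synthetic cost-per-line quotient.

```python
def tech_debt_ratio(remediation_hours: float, lines_of_code: int, hours_per_line: float) -> float:
    """Remediation cost divided by a synthetic development cost (LOC * cost per line)."""
    development_hours = lines_of_code * hours_per_line
    if development_hours <= 0:
        raise ValueError("development cost must be positive")
    return remediation_hours / development_hours


# Hypothetical inputs: 100 hours of estimated remediation in a 2,000-line module.
ratio = tech_debt_ratio(remediation_hours=100.0, lines_of_code=2_000, hours_per_line=0.5)
print(f"tech debt ratio: {ratio:.0%}")

# Same code, same remediation estimate, but a different synthetic quotient:
# the ratio halves although nothing about the code has changed.
ratio_other_quotient = tech_debt_ratio(remediation_hours=100.0, lines_of_code=2_000, hours_per_line=1.0)
print(f"tech debt ratio with another quotient: {ratio_other_quotient:.0%}")
```

Because the quotient is a guess, the ratio is more useful for comparing modules measured the same way than as an absolute cost figure.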
Now, we mentioned tools, right? These are the tools that I just put there off the top of my head, probably the biggest ones: SonarQube, Stepsize, and then Code Climate Quality, I believe it's called. And there are other tools, CLI tools, tools with a UI, and what have you. So let's talk about the pros and cons of heuristic tech debt metrics. Among the pros, there is the ease of getting the numbers for the full code base at once. Basically, once you choose the tool and set it up, you get these numbers, with all the necessary splits across modules, folders and so on, in minutes. Then there is the ease of segmenting the metrics, so getting the split by module or what have you that I just mentioned. This is useful for detecting potential hotspots, places where the metrics show more tech debt than elsewhere, which you potentially want to take a closer look at. Now, there are cons, obviously. The first is that it's hard to convert these metrics into the amount of actual work that the tech debt behind them requires. Take cyclomatic complexity: you know that in this class or in this method it is 15. What does that give you in terms of the effort to fix it? Practically nothing. So you will still need to look into it, estimate, and interpret it somehow.
4. Understanding Tech Debt Metrics (2)
Then, it's hard to convert tech debt metrics into business impact. Prioritizing hotspots based on these numbers alone is insufficient. Second-tier tech debt metrics include effort split, cycle time, bug trends, software uptime, and mean times. These metrics are directly connected to business impact, allowing for prioritization. However, they are generic and challenging to connect with specific code modules. The third bucket introduces the concept of tech debt interest, focusing on the ticket level to understand its impact.
Then, it's also hard to convert them into business impact. Again, a cyclomatic complexity of 15, what business impact does it have? How much does it slow down our feature development, if at all? We actually don't know. We need to look into it and estimate. It's also really hard to prioritize between hotspots because, again, we don't know the business impact of those hotspots, and more tech debt by these numbers doesn't always mean more business impact. So prioritization with these numbers alone doesn't give us much. So that's it about the heuristic tech debt metrics.
Let's look into the next bucket. Second-tier tech debt metrics are actually not exactly tech debt metrics, but rather metrics of what tech debt influences or causes. Things like the effort split between ad hoc support and features. You can define a threshold there. For instance, everything beyond 30% spent on ad hoc support would potentially be problematic. For some teams, it has to be even less to be problematic. So you define the threshold, but it's important to track that. Then the cycle time of feature tickets. Feature tickets, because in nearly every team we optimize for features: we want to introduce changes as fast as possible, because that is how we improve our product and our business KPIs, revenue, and so on. So if we track cycle time, we will see if we're slowing down, and we don't want to. Then there are bug trends. If we consistently open more bugs than we close over a period of time, we may get in trouble, because eventually we will be working only on bugs, and the effort split between ad hoc support and features will become too concerning. Then there is software uptime, a pretty clear metric, and all sorts of mean times: mean time to recover from an outage, mean time to detect an outage, and mean time between failures, aka outages. These basically show us how stable we are and how efficient we are at detecting outages and recovering from them, because every outage is potentially lost revenue, and we want to handle them quickly.
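As a rough illustration of how two of these second-tier metrics could be tracked, here is a minimal Python sketch. The `Period` record, the function names, and all numbers are hypothetical; the 30% threshold is the example value from the talk, and each team would pick its own.

```python
from dataclasses import dataclass


@dataclass
class Period:
    """One reporting period, e.g. a sprint (hypothetical structure)."""
    adhoc_hours: float
    feature_hours: float
    bugs_opened: int
    bugs_closed: int


def adhoc_share(period: Period) -> float:
    """Share of logged effort spent on ad hoc support rather than features."""
    total = period.adhoc_hours + period.feature_hours
    return period.adhoc_hours / total if total else 0.0


def backlog_growing(periods: list[Period]) -> bool:
    """True if more bugs were opened than closed in every tracked period."""
    return bool(periods) and all(p.bugs_opened > p.bugs_closed for p in periods)


ADHOC_THRESHOLD = 0.30  # example threshold; define your own per team

# Made-up data for three consecutive periods.
periods = [
    Period(adhoc_hours=40, feature_hours=120, bugs_opened=12, bugs_closed=9),
    Period(adhoc_hours=56, feature_hours=104, bugs_opened=15, bugs_closed=11),
    Period(adhoc_hours=64, feature_hours=96, bugs_opened=14, bugs_closed=10),
]

for number, period in enumerate(periods, start=1):
    share = adhoc_share(period)
    status = "over threshold" if share > ADHOC_THRESHOLD else "ok"
    print(f"period {number}: ad hoc share {share:.0%} ({status})")

print("bug backlog consistently growing:", backlog_growing(periods))
```

The same shape works for cycle time or the mean times: record a value per period and watch the trend rather than any single number.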
Now let's look at the pros and cons. The biggest pro of this bucket of metrics is that they are directly connected to business impact, which means that they give us leverage to prioritize tech debt related changes against business features, which is what we want, right? We come to our business stakeholders and say, okay, here's how much business impact this tech debt elimination will bring, and they say, okay, that's nice, then we prioritize it higher than this list of features. Which is exactly what we want. These metrics are also relatively easy to collect, provided certain best practices are followed, like time logged on tickets, work logged in the issue tracking system, downtimes recorded and so on, which is anyway something that I would strongly suggest you do. Among the cons, there is actually just one, but a big one. These metrics are really very generic. It is hard to connect them with specific code modules, classes, et cetera, to see what exactly you need to fix. So although they're directly connected to business impact, it's harder to promise the business stakeholders that you will deliver this business impact, because you need to go into some sort of investigation and guessing about what exactly is driving the negative business impact that you have now and can then fix. And this is where our third bucket, consisting of one beautiful metric called tech debt interest, comes into the picture. This is a concept that was described by Martin Fowler, the author of the book Refactoring, in 2008, in the article I linked here. The concept in a nutshell goes like this. It is important to get down to the ticket level when you want to understand how much tech debt is actually impacting you. Because for every ticket, we know the effort it took, provided we were logging it, and we should.
5. Calculating Tech Debt Interest
And with the team that has worked on this ticket, we can estimate the effort it would have taken without tech debt. The difference between the two is tech debt interest. We estimate tech debt interest for all tickets, considering bugs and new functionality. This allows us to calculate the actual overhead and business impact, enabling ROI calculation. To collect tech debt interest, ensure time logging and add an "estimation with no tech debt" field to every ticket. Repeat the exercise regularly to have data at your fingertips. Tech debt interest is directly connected to business impact and eases prioritization within the tech debt bucket.
And with the team that has worked on this ticket, we can actually also estimate how much effort it would have taken without tech debt. It is an imaginary situation, since there is never no tech debt, there can only be very little of it. But we can do this mental exercise and assume the ticket would have taken this much time if we weren't slowed down by tech debt. And the difference between these two is tech debt interest.
Then we go over all the tickets that we can remember and estimate it. Some tickets will have more tech debt interest, some will have less. Some will consist entirely of tech debt interest, because those are bugs or firefighting, like outages, that wouldn't be there if it wasn't for tech debt. Some will have no tech debt interest, because they are new functionality that isn't impacted by tech debt at all, like a separate module, something that we are starting anew.
And then, finally, we have the tech debt interest estimated for everything. We also put labels or tags, whatever you call them, that represent the modules or components that the tickets are related to. With this info, we can get a table of this sort, where we have our code segments, the effort, and consequently the average tech debt interest associated with them. For every segment of the code, we are able to calculate the actual overhead in effort, in person days or whatever unit of effort you're using. This is great for detecting the actual hotspots, meaning how much we were affected by this tech debt, and the business impact that fixing this tech debt will have. And that is exactly what we want, right? Because this is how we can calculate the ROI.
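A table like the one described above could be produced along these lines. This is only a sketch in Python: the `Ticket` fields, ticket keys, component names, and numbers are all invented for illustration, and in practice the data would come from your issue tracker's export.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Ticket:
    key: str
    component: str       # label or tag set in the issue tracker
    actual_days: float   # effort logged on the ticket
    no_debt_days: float  # team's estimate of the effort with no tech debt


def interest(ticket: Ticket) -> float:
    """Tech debt interest: logged effort minus the no-tech-debt estimate."""
    return max(ticket.actual_days - ticket.no_debt_days, 0.0)


def interest_by_component(tickets: list[Ticket]) -> dict[str, dict[str, float]]:
    """Sum logged effort and tech debt interest per component."""
    table: dict[str, dict[str, float]] = defaultdict(lambda: {"effort": 0.0, "interest": 0.0})
    for ticket in tickets:
        table[ticket.component]["effort"] += ticket.actual_days
        table[ticket.component]["interest"] += interest(ticket)
    return dict(table)


# Hypothetical tickets, in person days.
tickets = [
    Ticket("AUTH-1", "authentication", actual_days=10, no_debt_days=4),
    Ticket("AUTH-2", "authentication", actual_days=8, no_debt_days=0),  # outage caused purely by tech debt
    Ticket("PAY-1", "payments", actual_days=5, no_debt_days=4),
    Ticket("ONB-1", "onboarding", actual_days=6, no_debt_days=6),       # new module, no interest
]

table = interest_by_component(tickets)
for component, row in sorted(table.items(), key=lambda item: item[1]["interest"], reverse=True):
    share = row["interest"] / row["effort"]
    print(f"{component:15} effort={row['effort']:5.1f}  interest={row['interest']:5.1f}  share={share:.0%}")
```

Sorting by total interest puts the components that actually cost the most overhead at the top, which is the hotspot view the talk is after.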
ROI is the metric of them all for any changes that we want to prioritize against business features, because those have an ROI as well, and then we can compare apples with apples. Let's say we want to fix the authentication component and it will take five person days. Then we take the 14 person days of overhead due to tech debt in the recent, I don't know, six to 12 months, divide them by five, and get what, 2.8. So that's actually a relatively nice ratio.
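The arithmetic above is simple enough to sketch in a few lines of Python. The function name is an assumption, the authentication figures (14 person days of overhead, 5 person days to fix) are the ones from the example, and the second candidate is made up purely to show a comparison.

```python
def tech_debt_roi(overhead_days: float, fix_days: float) -> float:
    """Overhead caused by tech debt in the observed period divided by the effort to fix it."""
    if fix_days <= 0:
        raise ValueError("fix effort must be positive")
    return overhead_days / fix_days


# (overhead in person days, estimated fix effort in person days)
candidates = {
    "authentication": (14.0, 5.0),  # figures from the example above
    "payments": (1.0, 3.0),         # hypothetical second candidate
}

ranked = sorted(candidates, key=lambda name: tech_debt_roi(*candidates[name]), reverse=True)
for name in ranked:
    overhead, fix = candidates[name]
    print(f"{name:15} ROI = {tech_debt_roi(overhead, fix):.1f}")
```

Note that the overhead is measured over a past window, so the ratio assumes a similar amount of work will touch the component in the future.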
Now just a couple of words about how to collect tech debt interest, because it may be cumbersome, and it's good to have a working, reasonably optimal algorithm. First of all, you make sure your team logs time spent on tickets. Then you add a field, "estimation with no tech debt", to every issue, AKA ticket, in your issue tracker. Then you sit together with the team and fill in the field for past tickets within a reasonable timeframe, meaning the timeframe for which people can still remember the details of the tickets. For instance, beyond six months into the past, I would hardly expect anyone to remember anything in detail. But three months into the past, maybe. So define it for yourself. And then the most important thing is that from that point on, you repeat the exercise. For instance, either at every retrospective or, even better, where the issue tracker supports it, by adding this additional field to the issue close dialog. Then every engineer, when closing a ticket, just fills in this field, and you have the data at your fingertips at all times.
So, quickly about the pros and cons of tech debt interest. First of all, it's also directly connected to business impact, which as we know is great. Then, one of the biggest pros: it ignores sleeping tech debt that isn't actually causing any trouble, or hasn't been causing any trouble in the past months that we remember or have data about. It also establishes a really clear relation between modules, components, or other segments of your code and the business impact of the tech debt in them. That in turn eases prioritization within the bucket of tech debt, so you know clearly what you need to start working on in that bucket.
6. Using Tech Debt Metrics
Among the cons, it's relatively cumbersome to collect and opinionated. The metric is based on the work log, blind to unworked parts. Tech debt interest is crucial for measuring the impact of tech debt. It provides an educated prediction and eases cross prioritization. Heuristic metrics give an overall idea of code base health and are useful for estimating potential tech debt impact. Second tier metrics show the impact of tech debt elimination efforts. They are indispensable for tracking team and software health.
Among the cons, it's relatively cumbersome to collect. It is indeed so, but if you ask me, it's totally worth it, and it's really not so much hassle. Then, it is opinionated, which is questionably a disadvantage. I mean, it is a disadvantage, but you will have opinions involved in the interpretation of all the metrics that I mentioned previously. So, like every estimation, it is probably just a little bit more opinionated than the interpretations of the previous metrics, or sometimes maybe less.
And then the biggest one is that this metric is based on the work log, which means that it's completely blind to the parts of the code that haven't been worked on, because for those we don't have the data, all those estimations of how much a ticket would have taken if there wasn't any tech debt. That means we can't really make predictions about the future of those parts of the code and the tech debt interest that will be imposed on them. So we can only know these things about the parts of the code we have worked on. That would be it about tech debt interest, and now we can actually conclude how to use all those metrics. First of all, you have seen that there's no ultimate metric that you can just take and measure everything with. So there has to be some sort of synergy of them all. And this is the synergy I would suggest.
So, at the core of it, there would be tech debt interest, because it is great for measuring the impact of both types of tech debt, continuous and maintenance. You can slice and dice the data the way you like, and you will always have some numbers wherever you have the data recorded. Tech debt interest gives a very good educated prediction of the tech debt impact for the parts of the code you have tech debt interest data for, that is, the ones you have worked on. An educated prediction is much better than just an estimation: here you have some stats and a more granular estimation, which is almost always better. And then finally, I already mentioned the ROI. This is the only metric that will give you the ROI, and it greatly eases cross-prioritization between tech debt changes and business features. Then we add heuristic metrics into the picture. They give us a good overall idea of the code base health, regardless of whether we work on it or not. They are then useful for estimating the potential tech debt impact for the parts of the code that we're only planning to work on, so the ones we don't have tech debt interest data for. We can compare them with the parts that we do have tech debt interest data for and then proportionally assume that the tech debt interest will be roughly this or that. And these metrics are definitely indispensable for automated quality gates that many of you have likely seen, like all those checks on your merge requests or pull requests that say you can't merge because the cyclomatic complexity is too high, or the code is duplicated, and so on and so forth. This is great because then we don't increase the amount of tech debt, and we check for this automatically. And finally, second tier metrics.
They're great because they show the impact of our tech debt elimination efforts via trends: the share of ad hoc effort goes down, or, I don't know, the stability goes up, and so on. And they're generally indispensable for tracking the overall team and software health, and that's why I suggest that every team tracks them. And that would be everything I have to say about measuring tech debt and interpreting the results. Here's a slide with the sources, whenever you get hold of the presentation. And with that, thank you very much.