Engineering Practices

Three Ways to Stop the Toy Box Overflowing

The first post in this pair was about the problem. This one is about what to do on Monday morning.

The argument was that every team accumulates technical debt the same way and almost none of them are honest about it, and the toy box on my living room floor turned out to be the same shape as the codebase.

Several of you got in touch after the essay to ask the obvious next question. The toy box analogy is pretty. What does the actual practice look like in a real codebase, on a real team, with real deadlines and a real backlog and a real demo on Friday?

Three frames, in a deliberate order. Make the debt visible. Make the trade-off honest. Pick a rule about what comes in. The order is the entire point of the post, so I am going to give it to you up front rather than make you read to the end for it.

You cannot start with a rule about inflow, because you do not yet know which inflow is the one hurting you. You cannot start with the trade-off grid, because you cannot bucket debt you have not named. The only place you can start is with visibility, and visibility is unglamorous and slow and produces no shippable artefact for the first three weeks, which is why most teams skip it. They go straight to the grid or the rule, the grid fills up with guesses, the rule solves last year's problem, and six months later the team concludes that "tech debt management does not work for us" and goes back to the override. It works. It just has to be done in the order it is done in, because the dependencies between the frames are real.

Frame One: See What You Actually Have

The most expensive form of technical debt is not the kind that is documented and ugly. It is the kind that lives entirely in someone's head. The bit nobody touches because Dave wrote it and Dave left. The override that made sense in 2021 for reasons no longer in anyone's working memory. The third-party library the build now depends on because of one method call nobody can find. None of this is on a board. None of it has a size or an owner. All of it is paying interest, paid by every engineer who opens the file and spends twenty minutes working out what they are looking at before they can start the work they were hired to do.

The fix is the single highest-leverage change you can make. Every time you add an override, a workaround, a special case, or a TODO that you both know is not really a TODO, you write down what you have just done and why, in the same commit, in a place a future engineer will actually find it. Not a Jira ticket called "tech debt: revisit", because that ticket will be closed in a backlog grooming session in eighteen months by someone who has never opened the file. A specific note about a specific thing, in the code, next to the thing it is about. The rule on our team is that the note answers three questions. What did I just do that I would not normally do. Why did I do it. What would have to be true for someone to undo it safely.

Three questions, one comment, every time. It feels like overhead for the first week. By the end of the first month it is faster than not doing it, because you stop having the same conversation in code review every Thursday.

The thing nobody mentions about this practice is that it changes how people feel about adding the override in the first place. When the act of taking a shortcut comes with a thirty-second writing task, the writing task makes you think for thirty seconds about whether the shortcut is the right call. A non-trivial percentage of the time, those thirty seconds are enough to make you do it properly instead. The comment is not just documentation. It is friction in the right place.

What AI Changes About All of This

Before I get to the second and third frames, I want to flag something that has shifted in the last year, because pretending it has not shifted would make the rest of this post less honest.

I almost did not write this section, because every blog post on the internet is now obliged to include an AI bit and the obligation usually shows. But the shift is real and the post would be less useful if I pretended otherwise.

I have been working with Claude and similar tools intensively, and the economics of this work have changed in ways that are still settling. The frames are unchanged. The cost of executing them has dropped by something close to an order of magnitude, and that shift is worth naming because it changes which excuses are still valid.

Visibility is the obvious one. The three-question comment used to be a discipline you had to fight for in code review and a cost you had to justify in sprint planning. It is now a thing a model will draft for you in twelve seconds, accurately, in the right place, every single time, if you ask. The friction that used to be the point is optional, which means the only remaining reason not to do it is that you have not asked. That is a much weaker excuse than "we do not have time", and it should be treated as such.

The inflow rule, which is the third frame, is the bit AI does not help with, and I want to flag that explicitly because the temptation to believe it does is strong. A model can enforce a rule once you have picked one. It cannot pick the rule for you, because picking the rule requires knowing what is hurting your team in a way that only your team knows. AI is a force multiplier on disciplines you already have. It is not a substitute for the judgement that creates the discipline in the first place.

Frame Two: Bucket It on Purpose

Once you can see the debt, the next mistake to avoid is the binary. Most engineering writing on technical debt presents two options: fix it now, or never fix it and lose everything. This framing is dishonest, because it ignores the option every team is actually using, which is fix some of it some of the time and consciously live with the rest. The reason we end up with toy boxes that will not close is not because anyone chose to have one. It is because nobody chose anything at all. Each individual decision was deferred, and the deferred decisions aggregated into a default.

The frame that works is a four-way grid. Every piece of debt you have made visible falls into one of four buckets. Fix now. Fix next. Accept and document. Accept and forget.

The first three are obvious. The fourth is the one that does the real work, and it is the one most teams refuse to put on the wall.

Accept and forget is the bucket for debt you have looked at, sized, considered, and decided to live with, for now and probably forever, because the cost of fixing it is genuinely higher than the cost of carrying it. It is the workaround you wrote three years ago for a vendor bug the vendor will not fix and you will not switch suppliers over. It is the special case in a parser handling something that no longer exists in the world but used to. It is the override on a report that runs once a quarter for a customer who will be off the platform inside two years.

This bucket will feel wrong the first time you put something in it. Put something in it anyway. The reason it feels wrong is the same reason most technical debt accumulates in the first place: engineers are trained to fix things, and naming something as deliberately unfixed feels like a small professional failure. It is not. It is the opposite. It is the act of taking responsibility for a decision that was already being made implicitly, and turning it into a budget line you can defend in front of anyone who asks.

Most of what ends up in this bucket got there the same way. A delivery date, a risk that was understood at the time, and someone in a meeting saying "we will fix it later" with the best of intentions. Everyone in the room knew, even then, that "later" was doing a lot of work in that sentence. Later is where fixes go to enter the never-never. Putting the thing in "accept and forget" is just the act of admitting that out loud, instead of letting it pretend to be a "fix next" that nobody is ever going to fix.

The McKinsey number from the first post earns its place here. Their research suggests technical debt amounts to somewhere between 20 and 40 percent of the value of an organisation's entire technology estate, and that around a third of CIOs they surveyed believe more than a fifth of their technology budget gets quietly diverted into wrestling with the old stuff instead of building the new.

If those numbers are even directionally true for your codebase, you are already living with debt.

The question is which debt you are living with on purpose and which debt is living with you by accident.

The difference is whether you have a grid on the wall or not.

The discipline is to revisit the grid. We do it once a quarter, in a meeting that takes ninety minutes and involves nobody more senior than the people who actually write the code. Things move between buckets. Items in "fix next" get demoted to "accept and document" because the world changed. Items in "accept and forget" get promoted to "fix now" because something else shifted and they suddenly matter. The grid is not a static document. It is a living conversation about where the team's attention is going and why, and the conversation is the thing, not the artefact.

Frame Three: Pick One Rule About the Front Door

The first two frames are about the debt that already exists. The third is about the debt you are about to create, and it is the one with the highest long-term leverage and the lowest short-term cost, which is why almost nobody does it.

The rule is simple. Every team picks one thing that cannot enter the codebase without a paydown plan attached. Just one.

Two things is no things.

The specific rule matters less than the act of having one. For some teams it is no new feature flag without an expiry date in the same pull request. For some it is no new database column without a documented owner and a retention policy. For some it is no new third-party dependency without a calendared review six months out. For one team I know, it is no new public API endpoint without a deprecation strategy written down before the endpoint ships, on the working assumption that every endpoint will eventually need to be retired and the time to plan the retirement is at the moment of birth, not the moment of death.

The rule has to be one thing because the moment you have a list of policies, the team starts treating them as a checklist to argue with rather than a discipline to internalise, and the discipline is what you are actually trying to build. One rule, applied without exception, for long enough that it stops feeling like a rule and starts feeling like the way things are done here. Then, and only then, you can consider adding a second.

Be honest about how often the rule will change. We are on our third rule in two years, and I do not consider the first two failures. The first rule taught us what the team would actually enforce. The second rule taught us what kind of debt was hurting us most. The third rule, the current one, is the one that has stuck, and it stuck because the first two existed. You cannot pick the right rule at the start, because at the start you do not know enough about your own team's failure modes to pick well. You can only pick a rule, watch what happens, and pick a better one in six months.

The mistake is not picking the wrong rule. The mistake is not picking any rule at all because you are waiting for the perfect one.

What This Does Not Solve

None of the three frames is a magic spell, and anyone selling you a magic spell on this topic is selling you something you should not buy.

What the frames do is turn an invisible, guilt-driven, intermittent problem into a visible, budget-driven, continuous one. They do not eliminate the debt. They make the debt legible, intentional, and discussable, which is the difference between a problem you are managing and a problem that is managing you. The toy box at home still does not close every week. The codebase still has fossils in it. The only honest measure of progress I trust any more is whether tomorrow's pile is slightly smaller than today's, more weeks than not.

If it is, you are winning. Slowly. Which is the only speed any of this ever actually moves.