How to Form an Effective Technical Debt Strategy

Tue 29 October 2019


NOTE: The next 500 words recount the anecdote that inspired this article. To jump straight to how to form an effective technical debt strategy, skip ahead to the section titled "Forming an Effective Technical Debt Strategy".


I had just git pulled the changes that my team lead had made to the Java API from the night before. Our team had six people and we had been working against a deadline for the past few weeks. Our deliverable was an orchestrated set of Docker services on AWS. A new class I hadn't seen before jumped out at me. I felt a growing sense of frustration as I inspected it:

public class FooBarProvider {
  FooDao fooDao = new FooDao();

  public String getFooBar(Integer fooId) {
    FooRecord fooRecord = fooDao.getFoo(fooId);

    return fooRecord.bar;
  }
}

For context, here's what the foo table looked like:

TABLE foo (
  id  INT  PRIMARY KEY,
  bar TEXT NOT NULL,
  baz INT  NOT NULL,
  qux INT  NOT NULL,

  ...
);

I was biting back criticisms. Here are some thoughts that came to mind:

  • "Why create a class that exposes a single column? What if we need to access baz or qux in the future? Why not create a FooModel that fully reflects the underlying table?"
  • "We already make use of the Model pattern in our codebase. A Provider and a Model heavily overlap in terms of functionality. By introducing the former we are diluting the meaning of the latter and creating confusion."
  • "The solution is not elegant."
  • "This doesn't make any sense. It's illogical and messy. I can do it better."
  • "🤨😱🤯"

Needless to say I had some strong opinions about the implementation. Here's what I wished we had written instead:

public class FooModel {
  FooDao fooDao = new FooDao();
  FooRecord fooRecord;

  public FooModel(Integer fooId) {
    fooRecord = fooDao.getFoo(fooId);
  }

  public String getBar() {
    return fooRecord.bar;
  }

  public Integer getBaz() {
    return fooRecord.baz;
  }

  public Integer getQux() {
    return fooRecord.qux;
  }

  public void setBar(String bar) {
    fooRecord.bar = bar;
  }

  public void setBaz(Integer baz) {
    fooRecord.baz = baz;
  }

  public void setQux(Integer qux) {
    fooRecord.qux = qux;
  }

  public String computeSomeBusinessLogic() {
    return fooRecord.bar + String.valueOf(fooRecord.baz);
  }

  public void save() {
    fooDao.upsert(fooRecord);
  }
}

I brought up my concerns with my lead during code review and they were addressed as follows:


"Why not create a FooModel that fully reflects the underlying table?"

Lead: "What is the likelihood that we'll need to access baz and qux in the future?"
Me: "It's small."
Lead: "So how about we implement that functionality only if it becomes needed?"
Me: "But that means we'll be accepting an imperfect solution in the meantime!"
Lead: "What makes it imperfect?"
Me: "The fact that it doesn't follow the same Model pattern that all the other tables do."
Lead: "So what do you propose?"
Me: "That we implement it as a FooModel with all the associated CRUD methods."
Lead: "And the likelihood of needing getBaz(), getQux(), setBar(), setBaz(), setQux(), computeSomeBusinessLogic(), and save() is low, yes?"
Me: "...yeah."
Lead: "And it would probably take you longer to implement a FooModel than it took me to implement FooBarProvider, correct?"
Me: "..."


"OK, but this dilutes the meaning of a Model"

Lead: "To whom?"
Me: "To the people that are working on the API."
Lead: "Which are?"
Me: "Well, currently? Just you and me."
Lead: "I personally understand the distinction: for our purposes a Provider abstracts a single column in a table whereas a Model abstracts the entire table. And you understand the differences as well, so.."
Me: "Yes but if we have a large team that distinction might be lost on some people and lead to them writing incorrect code."
Lead: "At how many members would you estimate this becoming a concern?"
Me: "I haven't really thought about it, maybe 20?"
Lead: "There are currently six of us. Do you think we're on track to gaining another fourteen developers in the near future?"
Me: "(internal sigh) no."
Lead: "So let's worry about that only if we start getting big."


"Alright fine but this solution just isn't elegant!"

Lead: "Ah now this is your real concern! I completely agree!"
Me: "Huh??"
Lead: "To us developers it's not elegant. But to the business it is."
Me: "Ugh, I was worried you were gonna say that.."
Lead: "It may not be perfect, but it's also simple and it works right now—not a week from now. It allows us to get to market faster and frees us up to work on more interesting problems. We should allow growing pains to dictate when it's time to refactor rather than trying to plan for every possible outcome upfront."


Forming an Effective Technical Debt Strategy

Although I was slow to listen at first, these conversations (of which there were many more) had a lasting effect on me. Over time I grouped them under the larger topic of Technical Debt and how to go about discovering an effective Technical Debt Strategy for a team. They've led me to think about how to work with TD deliberately and in an intelligent manner. Below are my current thoughts on the subject.

Junior developers may find themselves too busy with grokking the software engineering landscape to fully make use of this article. Similarly, senior developers likely already have their own internal model for intuitively dealing with Technical Debt. These thoughts are meant for those in between. The intent is not that this approach be followed verbatim; rather it's a way to stimulate thought on the subject and accelerate learning.


Start the Discussion Early

In the ideal case this discussion occurs early on in the life of a programming team. The larger the group the sooner this should happen. Even better is if the company has a policy that encourages all teams of N members or more to have such a strategy.


Educate Team Members on Technical Debt

It's important to outline the pros and cons of Technical Debt in case some team members are not familiar with the concept. I'd personally emphasize that, much like regular debt, it can provide a benefit if used with intention, such as:

  • speeding up time-to-market,
  • unblocking a downstream team to start working on new features,
  • allowing your team to prototype features more quickly and avoid over-investing in a bad solution, or
  • gaining favour with the business side of the company by displaying an ability to prioritize results over "pretty code".

It comes with its own downsides, especially if you stumble into the debt by accident:

  • If forgotten it can have deleterious, cascading downstream effects throughout the rest of the system.
  • If the debt is poorly-documented or opaque in its implementation its authors become more "sticky" as they spend more time maintaining and fixing what only they know how to work with. This can reduce inter- and intra-team mobility and lead to less skill and knowledge sharing across the company.
  • It can reduce trust between the team and its stakeholders as unforeseen bugs and delays start to pop up, requiring unexpected deviations from the agreed-upon plan.


Find Consensus on What Technical Debt Means to Your Team

Technical Debt is an amorphous concept. You want to arrive at a consensus on what the term means to the team before going further. I personally like Uncle Bob's take that "A mess is not a technical debt". He argues that messy code is never justified and doesn't count as TD. My personal definition of Technical Debt is a piece of code which:

  • is production-level in all ways except for one or two. As a simplified example:
    • It's logically sound and well-documented but lacks unit tests.
    • It's well-tested and well-documented but the logic is suspect.
    • It's NOT, however, logically-suspect, and also lacking unit tests and documentation.
  • poses an above-average risk to the system,
  • has other, significantly less-risky, implementations available,
  • was made for a short-term gain, and
  • needs to be paid off as soon as possible.

TD implies the ability to produce a significantly less-risky solution given enough time. If you're working with spaghetti code the chances of doing so in a timely manner are small. In this case the code in question should be treated more like a quarantine site.


Frame Technical Debt as a Useful Thing, Because It Can Be!

Developers should be encouraged to talk about, and take on, Technical Debt if:

  • they feel it would help them meet team goals that they otherwise would not be able to meet, and
  • they can justify the Technical Debt.

If a stigma exists against sub-optimal solutions then team members become prone to sweeping their cut-corners under the rug and potentially even forgetting about them—leaving them to fester and cause problems where least-expected. The important thing is that a clear boundary be drawn around the TD so that it can be dealt with later.

Simultaneously it's important to emphasize to the team that once TD is taken on it becomes the entire team's responsibility and not just the original author's.


Where in the System Is It More/Less Acceptable to Accrue Technical Debt?

It's also beneficial to discuss what parts of the system you are more comfortable accruing debt in. Obviously when you're starting a new project you don't know what the individual components will be but certain, common, implementation patterns are likely to surface.

I generally think that the further away you are from your "Source of Truth"—the database, for example—the more acceptable it is to take on Technical Debt. A database schema should be less prone to cutting corners than the UI layer. Are there any places or abstraction layers where you refuse to accept any TD?

Some areas to consider in order to start the discussion:

  • middleware layers (auth, serialization/deserialization),
  • heavy/low compute processes,
  • build tools/pipelines, and
  • async/sync processes.
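
One way to make the outcome of this discussion concrete is to write it down as a small lookup table. The sketch below is only an illustration: the layer names, the three tolerance levels, and the `DebtPolicy` class are all hypothetical, and the example values are just one possible answer that follows the "Source of Truth" rule of thumb above.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical layers a team might distinguish. */
enum Layer { DATABASE_SCHEMA, AUTH_MIDDLEWARE, BUILD_PIPELINE, SERIALIZATION, UI }

/** Hypothetical tolerance levels a team might agree on. */
enum Tolerance { NONE, WITH_REVIEW, ALLOWED }

final class DebtPolicy {
  private final Map<Layer, Tolerance> rules = new EnumMap<>(Layer.class);

  DebtPolicy() {
    // Assumed example values; every team should fill in its own.
    rules.put(Layer.DATABASE_SCHEMA, Tolerance.NONE);
    rules.put(Layer.AUTH_MIDDLEWARE, Tolerance.NONE);
    rules.put(Layer.BUILD_PIPELINE, Tolerance.NONE);
    rules.put(Layer.SERIALIZATION, Tolerance.WITH_REVIEW);
    rules.put(Layer.UI, Tolerance.ALLOWED);
  }

  /** Layers without an explicit rule default to the strictest setting. */
  Tolerance toleranceFor(Layer layer) {
    return rules.getOrDefault(layer, Tolerance.NONE);
  }

  /** True if the team accepts any Technical Debt at all in this layer. */
  boolean mayAccrueDebt(Layer layer) {
    return toleranceFor(layer) != Tolerance.NONE;
  }
}
```

Defaulting unknown layers to `NONE` is a deliberate choice in this sketch: a layer nobody has discussed yet is treated as off-limits until the team says otherwise.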


Assign Each Technical Debt Item a Priority

All solutions labeled as Technical Debt should be assigned a priority no later than the code review phase. In other words, quantify as early as possible how urgently each debt needs to be dealt with. Estimates of most kinds in the programming domain are notoriously hard to make, so the goal here is only to rank the list of debt and decide what should be fixed first.

Also, "fixing" doesn't necessarily mean immediately jumping to the "ideal" solution. It can be an incremental process that slowly removes and manages risk over time.
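
To show what "ranking the list" could look like in practice, here is a minimal sketch. The `DebtItem` fields, the three-level `Severity` scale, and the tie-breaking rule (cheapest fix first) are my own assumptions for illustration; your team may well rank on different criteria.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical severity scale; a later constant means more urgent. */
enum Severity { LOW, MEDIUM, HIGH }

/** One tracked debt item. Field names are illustrative only. */
final class DebtItem {
  final String summary;
  final Severity severity;
  final int estimatedHours; // rough guess, used only as a tie-breaker

  DebtItem(String summary, Severity severity, int estimatedHours) {
    this.summary = summary;
    this.severity = severity;
    this.estimatedHours = estimatedHours;
  }
}

final class DebtRanking {
  /** Returns a sorted copy: most severe first; among equals, the cheapest fix first. */
  static List<DebtItem> rank(List<DebtItem> items) {
    List<DebtItem> sorted = new ArrayList<>(items);
    sorted.sort(
        Comparator.comparing((DebtItem item) -> item.severity).reversed()
            .thenComparingInt(item -> item.estimatedHours));
    return sorted;
  }
}
```

The hour estimates only break ties between items of equal severity, so a bad estimate can reorder neighbours but cannot push a low-severity item above a high-severity one.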


Track It Publicly

Technical Debt should be explicitly labeled as such—ideally in a central place so others don't have to go hunting for it. How you group it is up to you: by project, team, developer, etc. Something as simple as a Google Doc or a technical-debt.md file is enough. If you use more complicated programs like Jira (shudder) that's a great place to capture debt as well. This is the place where you would record each item's associated urgency level and rank the resulting list.


Define Your Technical Debt Budget

This depends on the stage and importance of your product. If you're strictly in the prototyping phase your budget is going to be quite high, perhaps even infinite if you often discard your prototypes at the end of an exploration. If this is an application that serves many users and is simultaneously receiving active development effort then that budget is going to be smaller.

The budget can be quantified and reflected in different ways:

  • number of allocated developer-hours per sprint,
  • number of individual debts to pay off per sprint,
  • number of days with no user-facing changes per quarter,
  • a Service Level Agreement for your product,
  • etc.

The rate at which you accrue Technical Debt is also important. Your ability to take on TD is directly correlated with how much risk your team is willing to carry at any given point in time. The risk for a single Technical Debt item starts at an above average level and naturally decreases over time to settle at a point slightly above a normal, production-level feature. This generally leads to two strategies:

  1. Run a low technical debt ceiling: allow for slow accrual that is paid off over a long period of time. As each single item naturally de-risks, you become comfortable with taking on more.
  2. Run a high technical debt ceiling: carry little TD but allow for the ability to take on large bursts of it for short periods of time. You then spend the next N sprints paying it back, where N is 0.5 or more.
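
For those who like to see numbers, the risk curve and the ceiling can be sketched as follows. This is purely a toy model: the exponential decay, the half-life of four sprints, the settled risk of 1.1 (against 1.0 for a normal feature), and the class names are all assumptions I made up for illustration, not measurements.

```java
/** Sketch of the risk curve described above; every constant is an assumption. */
final class DebtRisk {
  static final double SETTLED_RISK = 1.1;      // assumed floor, slightly above a normal feature's 1.0
  static final double HALF_LIFE_SPRINTS = 4.0; // assumed: excess risk halves every four sprints

  /** Risk of one item after ageInSprints, decaying from initialRisk toward SETTLED_RISK. */
  static double riskAt(double initialRisk, double ageInSprints) {
    double excess = initialRisk - SETTLED_RISK;
    return SETTLED_RISK + excess * Math.pow(0.5, ageInSprints / HALF_LIFE_SPRINTS);
  }
}

/** The team's ceiling: the total debt risk it is willing to carry at once. */
final class DebtCeiling {
  final double maxTotalRisk;

  DebtCeiling(double maxTotalRisk) {
    this.maxTotalRisk = maxTotalRisk;
  }

  /** True if the carried risks plus the new item's risk stay at or under the ceiling. */
  boolean canTakeOn(double[] carriedRisks, double newItemRisk) {
    double total = newItemRisk;
    for (double risk : carriedRisks) {
      total += risk;
    }
    return total <= maxTotalRisk;
  }
}
```

With a ceiling of 5.0 and one existing item that started at a risk of 3.0, a second item of risk 3.0 does not fit right away (6.0 in total) or after four sprints (2.05 + 3.0), but it does after eight sprints (1.575 + 3.0). That is the "as each item de-risks, you can take on more" effect in miniature.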


All Technical Debt Should Receive Extra Testing Effort

When it becomes apparent to the developer that the feature they're working on is, in fact, Technical Debt the policy should be that extra effort be put towards writing tests for the code in question. I should emphasize that this does not necessarily equate to more tests, just more thought and rigor. This provides two benefits:

  1. it allows you to obtain a greater degree of confidence in the code, and
  2. it makes it easier to swap in a better solution in the future as the code's contract is (hopefully) better captured by the tests.

If your Technical Debt budget is close to infinite (i.e. this is a prototype) then this point applies less.
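
As an illustration of tests that capture a contract rather than an implementation, here is a sketch based on the FooBarProvider from the anecdote. The `FooBarSource` interface, the map-backed stand-in, and the `FooBarContract` class are hypothetical names of my own, and the rule for unknown ids is an assumption (the original code doesn't show what happens when a row is missing). Plain assertions are used instead of a test framework to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical seam: the one behaviour callers rely on today. */
interface FooBarSource {
  String getFooBar(Integer fooId);
}

/** Stand-in for the quick solution, backed by a map instead of the real FooDao. */
final class InMemoryFooBarProvider implements FooBarSource {
  private final Map<Integer, String> bars;

  InMemoryFooBarProvider(Map<Integer, String> bars) {
    this.bars = bars;
  }

  @Override
  public String getFooBar(Integer fooId) {
    return bars.get(fooId);
  }
}

/** The contract, written once and run against any implementation. */
final class FooBarContract {
  static void verify(Function<Map<Integer, String>, FooBarSource> factory) {
    Map<Integer, String> rows = new HashMap<>();
    rows.put(1, "alpha");
    rows.put(2, "beta");
    FooBarSource source = factory.apply(rows);

    check("alpha".equals(source.getFooBar(1)), "returns bar for an existing id");
    check("beta".equals(source.getFooBar(2)), "does not mix up rows");
    // Assumed rule: the team has to decide this explicitly for the real code.
    check(source.getFooBar(99) == null, "unknown id yields null");
  }

  private static void check(boolean ok, String rule) {
    if (!ok) {
      throw new AssertionError("contract violated: " + rule);
    }
  }
}
```

Calling `FooBarContract.verify(InMemoryFooBarProvider::new)` passes today. If a FooModel-backed implementation ever replaces the provider, it only has to pass the same three checks, which is what makes the swap cheap.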


You Will Deviate from Your Strategy and That's OK!

You and your team spend half a day eagerly discussing these strategies and maybe even commit to them for two or three sprints, but then Real Life hits. You find that your team has deviated from your meticulously set out plan. You feel discouraged and consider scrapping the strategy altogether.

Perhaps you took on more TD than you had planned for and need to spend an extra week cleaning up the damage thus causing delays. Maybe a code-review revealed that a TD feature wasn't adequately tested and has more risk than your team is comfortable carrying. Maybe you had to quickly patch an issue in the Database layer with some spaghetti code in a middle-of-the-night, on-call coding session. This is all expected.

Deviations from your Technical Debt Strategy are normal and you shouldn't get discouraged when they happen. In fact it will likely take you some time before your team arrives at a sufficiently defined and accurate strategy that everyone is happy with. After the adoption of a TDS you should devote the next few sprints to adjusting it.


If People Aren't Thinking About It, It Doesn't Exist

Your team very likely has one or more communication times and/or places that serve the function of allowing everyone to sync up with one another. Slack, post-mortems, daily standups (shudder again), IRC, etc.

Once the strategy is sufficiently defined, I strongly advise you to do the following:

  • Condense it as much as possible while still retaining its essence.
  • Give it a writing treatment; preferably three—first draft, second draft, third draft. Minimize the friction of mentally-ingesting the document as much as possible.
  • Try to imagine yourself as a new developer in the company. Does this document equip them with the information necessary to deal with TD as your team has agreed upon? Does it answer all their questions or do they need to seek out others to get fully onboarded?
  • Test it by asking people from other departments to read it and see if anything confuses them.
  • Indicate on the document that:
    • it is a living document,
    • it is subject to change, and
    • change and discussion are encouraged.
  • Indicate which team is in charge of this Technical Debt Strategy.

Try to make this document live as close to this communal time/place as possible. Point to it often (not literally), reference it, and encourage others to do the same (again, not literally).


Think About Your Technical Debt Strategy's Lifetime Hooks

Depending on how detailed you want to make this strategy you will be asking your developers to invest a not-insignificant amount of their time and energy into adopting it. It's good to make sure that their effort doesn't go to waste. You want to capture as much of the gained value as possible. This value can be lost during:

  • A team member leaving
  • A team dissolving
  • A team lead leaving
  • A project ending
  • The company ending

Being a programmer, I like to envision the Technical Debt Strategy as an object that can subscribe to critical events throughout your team's lifetime:

onNewTeam()
onTeamDissolving()

onMemberJoin()
onMemberLeave()

onTeamLeadChange()

onSprintEnd()

onCompanyEnd()

onStrategyUpdate()
onStrategyEnd()

onConflictBetweenTeamMembers()

// Etc.

It may be overkill but I personally prefer to entertain as many of the above possibilities (oftentimes inevitabilities) as possible and consider:

  • Which stakeholders does each event impact?
  • How do you preserve as much as possible of the knowledge and experience gained from creating and working with this strategy in each event? How can you pass it on to others?
  • How do you ensure the strategy remains active—and actively-followed—while still being as lightweight as possible?

The goal of all of this is to future-proof the strategy as much as possible and maximize its effective lifetime. And finally, it's a good idea to consider what to do if the idea of a Technical Debt strategy doesn't work with the team/company/organization. Is there a backup approach? How long should you evaluate a strategy before deciding if it's worth keeping?
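
If you want to take the "strategy as an object" metaphor literally, the events listed above could be modelled as a listener interface. This is a sketch for the sake of the thought experiment, not something I'd expect anyone to deploy: `StrategyLifecycleListener`, `HandoverChecklist`, and `TeamEvents` are hypothetical names, only a handful of the events are covered, and the follow-up actions are examples I made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listener mirroring some of the events above; all hooks default to no-ops. */
interface StrategyLifecycleListener {
  default void onMemberJoin(String member) {}
  default void onMemberLeave(String member) {}
  default void onTeamLeadChange(String newLead) {}
  default void onSprintEnd(int sprintNumber) {}
}

/** Example listener: collects the follow-up actions each event should trigger. */
final class HandoverChecklist implements StrategyLifecycleListener {
  final List<String> actions = new ArrayList<>();

  @Override
  public void onMemberJoin(String member) {
    actions.add("Walk " + member + " through the strategy document");
  }

  @Override
  public void onMemberLeave(String member) {
    actions.add("Reassign debt items owned by " + member);
  }

  @Override
  public void onSprintEnd(int sprintNumber) {
    actions.add("Review the debt list after sprint " + sprintNumber);
  }
}

/** Minimal dispatcher that forwards team events to every subscribed listener. */
final class TeamEvents {
  private final List<StrategyLifecycleListener> listeners = new ArrayList<>();

  void subscribe(StrategyLifecycleListener listener) {
    listeners.add(listener);
  }

  void memberJoined(String member) {
    listeners.forEach(l -> l.onMemberJoin(member));
  }

  void memberLeft(String member) {
    listeners.forEach(l -> l.onMemberLeave(member));
  }

  void teamLeadChanged(String newLead) {
    listeners.forEach(l -> l.onTeamLeadChange(newLead));
  }

  void sprintEnded(int sprintNumber) {
    listeners.forEach(l -> l.onSprintEnd(sprintNumber));
  }
}
```

The useful part of the exercise is the empty default methods: every hook you leave as a no-op is an event for which the strategy currently has no answer.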


Adapt, Discuss, Question, Reach Out!

These ideas are just my personal take on how to deal with Technical Debt effectively and at scale. Some people will agree with them, some won't, and many will have radically different approaches, which is great! In all cases, reach out to me with your experiences of working with Technical Debt. I'd love to hear about you and your strategies!


P.S. I've created a Technical Debt Strategy template that you can use to kickstart a discussion with your team around how to work with, and manage, Technical Debt:

# TEAM_NAME Technical Debt Strategy
NOTE: This is a living document. Feedback and discussion are not only welcome but encouraged!  

Version: 1.2.3  
Date:  
Lead:  
Members:  

### Our Criteria for Technical Debt
Our definition of Technical Debt is a piece of code which:

- is production-level in all ways except for one or two,
- poses an above-average risk to the system,
- has other, significantly less-risky, implementations available,
- was made for a short-term gain, and
- needs to be paid off as soon as possible.

### Where in Our System Is It Acceptable to Accrue Technical Debt?
(ex. the UI layer)  
(ex. serialization/deserialization middleware)  
(ex. async processes)  

### Where in Our System Is It Not Acceptable to Accrue Technical Debt?
(ex. the Data layer)  
(ex. the auth middleware)  
(ex. the build pipeline)  

### Technical Debt Severity Grading Scheme
(ex. low, medium, high)  
(ex. 8 developer hours to fix, 40 developer hours to fix, 120 developer hours to fix)  

### Where Are We Tracking Our Technical Debt?
(ex. [Jira](https://link-to-where-you-track-it.com "Our Technical Debt List"))  
(ex. [Slack](https://link-to-where-you-track-it.com "Our Technical Debt List"))  
(ex. Post-Its on the board in room #42)  

### What Do We Do When We Spot New Technical Debt?

- Devote an extra N points/hours to writing tests.
- Assign the debt a severity (ex. low, medium, high).
- Add it to the tracker.
- Bring it up during the next standup.

### What Is Our Technical Debt Budget?
(ex. 10 developer-hours/sprint)  
(ex. 1 high-severity debt paid off per quarter)  
(ex. 1 low-severity debt paid off per sprint)  

### Do We Have a High or Low Technical Debt Ceiling?
(ex. low ceiling, slow accrual, paid off slowly)  
(ex. high ceiling, bursty accrual, paid off quickly)  

### Lifetime Events for Our Technical Debt Strategy
What happens if/when the following events occur? Who do they impact? How do we preserve as much gained knowledge as possible in all cases?

- A new team is formed.
- This team dissolves.
- A new member joins the team.
- A member leaves the team.
- The team lead changes.
- The end of a sprint occurs.
- The company dissolves.
- We want to change this strategy.
- We want to end this strategy.
- We deviate from this strategy.
- There is a conflict between team members regarding this strategy.