Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Overview

From October through February, the Escalation team closed out an average of 33 issues per month.

...

When looking closer, we determined that the costliest part of triage was finding the root cause. In some cases, we even found the necessary fix for someone to pick up later. We went through our costly triage process because we wanted bugs to be as easy as possible for engineers to fix once they picked them up. The problem was that bugs weren’t being picked up fast enough to justify that investment.

As an experiment, we singled out this part of triage and deferred it to when we were ready to resolve the bug. Our prediction was that, with the same amount of effort, we’d be able to close out more tickets.

Graphically:

Image RemovedImage Added


Image RemovedImage Added

For the month of March, this prediction turned out to be true. 50% more tickets that we triaged were closed out, and the rates at which bugs were resolved and triaged converged:

...

Crucially, this improvement was only possible because of the Performance Review work that the team did. We had good intentions for our costly triage process: we wanted a ticket to be in an optimal state for a developer to pick it up and fix it. In retrospect, the problem was that tickets weren’t being picked up fast enough to justify that cost. But, since we weren’t measuring the work we were doing triaging, that burden stole upon us silently. Once we started measuring that work, we started seeing exactly how it affected our productivity, and, consequently, which modes of actions would be useful. Our improvement could only work because of the disparity in the rates of triage and resolution; had the data told us something different, we would have had to do something else.

...

For more information, I highly encourage you to peruse our dashboard that keeps track of these statistics.

...