Useful (and Useless) Metrics to Understand Whether Your Team Is Improving

(or: how to measure without becoming a slave to numbers)

There was a time when I truly believed metrics were the solution.

The team was growing, the backlog was full, and we constantly felt like we were chasing something.
As often happens, the question was always the same:

“Are we improving, or are we just running faster?”

And, as often happens… when you can’t answer, you start measuring.

At first it all looks great: dashboards, charts, month-over-month trends, reports that scream “serious company.”
Then the first strange thing happens:

  • the number of closed tickets goes up
  • velocity seems to improve
  • releases increase

…and yet customers start complaining more.
Regressions show up in production.
Support becomes tense.
The team looks more and more “tired.”

That’s when I realized something very simple (and slightly annoying):

Metrics don’t lie.
But they can still tell you the wrong story.


Metrics are not truth: they’re a lens

A metric is not a verdict.
It’s a perspective.

It helps you see something more clearly… while making you blind to something else.
And that’s why many organizations end up hurting themselves with numbers:
not because they measure too much, but because they believe too much.

If you treat a metric as “reality,” sooner or later you’ll hit a wall.

The invisible trap: measuring changes behavior

Here’s the part many people ignore:
a metric doesn’t just measure.

A metric is also an incentive.

The moment you start tracking something, you’re telling the team:

“This thing matters.”

And the team—rightfully—will start optimizing for it.

That’s normal. It’s human.
And it’s also why you get effects like:

  • measure “tickets closed” → you get lots of artificially split micro-tickets
  • measure “velocity” → you get shortcuts and more debt
  • measure “commits” → you get meaningless commits
  • measure “hours” → you get more hours, not more value

Not because the team wants to trick you.
Because people are smart: they understand what you want even before you say it explicitly.

That’s where a famous (and brutal) rule, Goodhart’s Law, applies:

When a measure becomes a target, it stops being a good measure.


My first mistake: chasing the “perfect dashboard”

At some point we had “everything”:

  • Jira charts
  • Git reports
  • CI numbers
  • percentages everywhere

And yet… we still couldn’t make better decisions.

It felt like staring at engine parameters while the car was dying,
and assuming the problem was “not enough data.”

In reality the problem was different:

We had metrics.
We didn’t have diagnosis.

It’s the difference between:

  • looking at a thermometer
  • understanding why you have a fever

The point isn’t “measuring”: it’s reading in context

Metrics without context are like an X-ray without a doctor:
they might help, but they can also scare you for no reason.

Practical example: “we’re going faster”

Scenario:

  • tickets closed: +30%
  • “apparent” velocity: +20%
  • average task time goes down

Looks like improvement, right?

Then you look at reality:

  • production incidents increased
  • regressions are more frequent
  • support is on alert
  • people are working “just a bit too much” all the time

What you’re observing isn’t performance.
It’s pressure.

And pressure in engineering always produces the same effect:

  • short term: it makes you look fast
  • mid term: it makes you fragile
  • long term: it breaks you

The anti-false-signal trick: segment the work

One of the most common mistakes is to measure everything in the same bucket.

“Average cycle time” for what?

  • features?
  • bugfixes?
  • refactors?
  • incidents?

Those are different universes.

Mixing them together creates a statistic that looks useful…
but is really just an average between apples and trucks.

So if you want to avoid lying to yourself without realizing it:

Always segment metrics by type of work.
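
To make the point concrete, here is a minimal sketch of what segmentation looks like in practice. The ticket data is entirely hypothetical; the only idea it illustrates is reporting a median per work type instead of one blended average.

```python
from statistics import median

# Hypothetical ticket records: (work_type, cycle_time_days)
tickets = [
    ("feature", 9.0), ("feature", 12.5), ("feature", 7.0),
    ("bugfix", 1.5), ("bugfix", 0.5), ("bugfix", 2.0),
    ("incident", 0.2), ("refactor", 5.0),
]

def cycle_time_by_type(tickets):
    """Median cycle time per work type, instead of one blended average."""
    by_type = {}
    for work_type, days in tickets:
        by_type.setdefault(work_type, []).append(days)
    return {t: median(v) for t, v in by_type.items()}

print(cycle_time_by_type(tickets))
# {'feature': 9.0, 'bugfix': 1.5, 'incident': 0.2, 'refactor': 5.0}
```

Note how the blended average of all eight numbers would land somewhere near 4.7 days, a figure that describes none of the four kinds of work.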


Healthy metrics measure the system, not people

If I had to sum up the “senior CTO” principle in one sentence:

Useful metrics don’t tell you “who is slow.”
They tell you “where the system creates waiting.”

The metrics that truly work are almost always flow and stability metrics.

Why?
Because most problems aren’t “productivity.”
They’re the system generating friction:

  • too many dependencies
  • massive queues (queue time)
  • slow reviews
  • unstable scope
  • priorities changing every two days
  • painful deployments
  • fragile CI
  • repetitive incidents

And you don’t see any of that through “how many tickets you closed.”

You see it by observing the flow.


Real improvement has a very specific sound

A team that truly improves isn’t the one that “does more.”
It’s the one that:

  • finishes work with less waiting
  • releases with less fear
  • recovers faster when things go wrong
  • accumulates less debt
  • depends less on heroes
  • stays sustainable over time

That’s the difference between running and getting better.


The tyranny of metrics: when everything becomes control

This is the most delicate part, because this is where trust breaks.

It happens when metrics are used like this:

  • as a weapon in a meeting
  • to compare people
  • as individual KPIs
  • as “motivation” (read: pressure)

From that moment on, the game is over.

Because the moment a metric becomes judgment:

  • data becomes fake
  • the team becomes defensive
  • conversations become political

And the paradox is:

The more you use metrics to control, the less reliable they become.

The healthy alternative is the opposite:

Metrics as tools for coaching and continuous improvement.

Metrics should generate questions, not verdicts:

  • “Where are we losing time?”
  • “Where does the flow get stuck?”
  • “What makes a release risky?”
  • “What keeps forcing us into hotfixes?”
  • “What kind of work drains us the most?”

A mini playbook to introduce metrics without causing harm

If you want a pragmatic approach (and zero dashboard anxiety):

  1. Start from a real question
    “Why are we releasing slowly?”
    “Why do bugs show up so late?”

  2. Pick 2–3 metrics max
    Few, observable, with immediate impact

  3. Look at them as trends, not targets
    The absolute number matters little. The direction matters.

  4. Discuss them with the team
    Metrics are not a management report.
    They’re a shared language for improvement.

  5. If they don’t help you decide: delete them
    A useless metric is noise and distraction.
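
Step 3 above (“trends, not targets”) can itself be sketched in a few lines. This is one possible way to read direction, assuming weekly medians as input: compare the most recent window to the one before it, and report only which way things are moving.

```python
from statistics import median

def trend(values, window=4):
    """Compare the recent window to the previous one.
    The direction matters more than any single number."""
    recent = median(values[-window:])
    previous = median(values[-2 * window:-window])
    if recent < previous:
        return "improving"   # e.g. cycle time going down
    if recent > previous:
        return "worsening"
    return "flat"

# Weekly median cycle time in days (hypothetical)
weekly_cycle_time = [8, 9, 8, 10, 7, 6, 7, 5]
print(trend(weekly_cycle_time))  # improving
```

The medians deliberately absorb one-off spikes: a single bad week should not flip the reading.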


The metrics I actually recommend

Below is a list grouped by goal.
You don’t need all of them—pick the ones that answer your questions.


1) Flow / Delivery (are we delivering better?)

Core metrics

  • Lead Time (from idea to production)
  • Cycle Time (from work started to done)
  • WIP (Work in Progress) (how many things are open at the same time)
  • Queue Time (time waiting between steps)
  • Throughput (how many items are actually completed in a time window)

How to interpret them

  • high cycle time often = large tasks, slow reviews, dependencies
  • high queue time often = process waiting (handoffs, unstable priorities)
  • high WIP almost always = multitasking and context switching
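
A minimal snapshot of the core flow metrics can be computed from nothing more than start and finish dates per work item. The items below are hypothetical; the sketch shows throughput, average cycle time, and WIP in one pass.

```python
from datetime import date

# Hypothetical work items: (started, finished); finished=None means still open
items = [
    (date(2024, 5, 1), date(2024, 5, 6)),
    (date(2024, 5, 2), date(2024, 5, 4)),
    (date(2024, 5, 3), None),
    (date(2024, 5, 5), date(2024, 5, 12)),
]

def flow_snapshot(items):
    cycle_times = [(f - s).days for s, f in items if f is not None]
    return {
        "throughput": len(cycle_times),                 # items completed in the window
        "avg_cycle_time_days": sum(cycle_times) / len(cycle_times),
        "wip": sum(1 for _, f in items if f is None),   # still-open items
    }

print(flow_snapshot(items))
```

Queue time needs one more timestamp per step (when an item entered a column versus when someone actually picked it up), but the shape of the computation is the same.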

2) Quality / Stability (are we releasing safely?)

Core metrics

  • Change Failure Rate (how many releases cause incidents/rollbacks)
  • MTTR (mean time to recovery)
  • Incident Frequency (how often incidents happen)
  • Rework Rate (how many things come back / get redone)

How to interpret them

  • if you release more often and failure rate doesn’t rise → you’re maturing
  • MTTR is one of the “truest” metrics: it measures resilience, not just quality
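
Both metrics fall out of a simple release log. The log below is hypothetical; the sketch just shows that Change Failure Rate is a ratio over all releases, while MTTR averages only over the releases that actually failed.

```python
# Hypothetical release log: (release_id, caused_incident, minutes_to_recover)
releases = [
    ("r1", False, 0),
    ("r2", True, 45),
    ("r3", False, 0),
    ("r4", True, 15),
    ("r5", False, 0),
]

def change_failure_rate(releases):
    """Fraction of releases that caused an incident or rollback."""
    failed = sum(1 for _, bad, _ in releases if bad)
    return failed / len(releases)

def mttr_minutes(releases):
    """Mean time to recovery, averaged over failed releases only."""
    recoveries = [m for _, bad, m in releases if bad]
    return sum(recoveries) / len(recoveries)

print(change_failure_rate(releases))  # 0.4
print(mttr_minutes(releases))         # 30.0
```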

3) CI/CD (is the delivery system healthy?)

Core metrics

  • Build Time (pipeline duration)
  • Pipeline Failure Rate (how often CI fails)
  • Time to Green (average time to fix a broken pipeline)
  • Merge-to-Deploy Time (time between code merge and production)

How to interpret them

  • slow pipelines = bigger batches = more risk
  • fragile pipelines = teams lose trust in the process
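
Pipeline health can be read from an ordered list of CI runs. The runs below are hypothetical; the sketch computes the failure rate and Time to Green, measured from the first red run of a breakage until the pipeline goes green again.

```python
# Hypothetical CI runs in chronological order: (status, timestamp_minutes)
runs = [
    ("green", 0), ("red", 30), ("red", 60), ("green", 90),
    ("green", 120), ("red", 150), ("green", 165),
]

def failure_rate(runs):
    """Fraction of runs that failed."""
    return sum(1 for status, _ in runs if status == "red") / len(runs)

def avg_time_to_green(runs):
    """Average minutes from the start of a breakage until green again."""
    durations, broken_since = [], None
    for status, t in runs:
        if status == "red" and broken_since is None:
            broken_since = t                      # breakage starts
        elif status == "green" and broken_since is not None:
            durations.append(t - broken_since)    # breakage ends
            broken_since = None
    return sum(durations) / len(durations)

print(failure_rate(runs))       # ~0.43
print(avg_time_to_green(runs))  # 37.5
```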

4) Sustainability (are we improving without burning the team?)

Metrics (often qualitative)

  • on-call load
  • interruptions / context switching (even if estimated)
  • onboarding time
  • dependency on single people (bus factor)
  • “unplanned work” vs planned work

How to interpret them

  • if urgency dominates, the system is not under control
  • if a few people always “save the day,” you’re building a fragile organization
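
One rough proxy for the “hero” problem, assuming you can group commits by area of the codebase, is the share of commits made by the single top contributor in each area. The data below is hypothetical; a share close to 1.0 suggests a bus factor of one.

```python
from collections import Counter

# Hypothetical commit authors per area of the codebase
commits = {
    "billing": ["anna", "anna", "anna", "marco"],
    "search":  ["lucia", "marco", "anna", "lucia"],
}

def hero_share(commits):
    """Share of commits by the top contributor per area.
    Values near 1.0 suggest the area depends on a single person."""
    result = {}
    for area, authors in commits.items():
        top_count = Counter(authors).most_common(1)[0][1]
        result[area] = top_count / len(authors)
    return result

print(hero_share(commits))  # {'billing': 0.75, 'search': 0.5}
```

Like the other sustainability signals, this is a conversation starter, not a judgment: a high share may simply mean the area is new, or small.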

5) Customer impact (are we building things that matter?)

There’s no universal set here, but you can track:

  • customer-reported bugs / support tickets
  • feature adoption (when measurable)
  • response time to real problems
  • volume of “unplanned work” caused by the product

Guiding principle

Output is not value.
Value is what remains after the work is done and the user actually uses it.


Bonus: metrics I would almost always avoid

  • story points completed
  • commit count
  • lines of code
  • hours worked
  • individual “productivity”
  • velocity as a target

Not because they’re always wrong, but because they almost always:

  • create the wrong behavior
  • turn into theater
  • give you false confidence

The right way to use them

Metrics are not meant to control people.
They’re meant to observe a complex system from multiple angles.

When you use them well, something great happens: you stop chasing blame, and you start improving leverage points.

And at the end of the day, the only thing that matters is this:

A team is improving when it releases with more confidence,
recovers faster when things go wrong,
and builds value without burning out.