Where AI actually pays back - and where it doesn't

Most of the AI we've shipped in 2025 didn't look like AI. It looked like a quiet inbox that got answered overnight. A weekly report that produced itself. A spreadsheet that stopped drifting. The pattern is the same in every business - and it almost never matches the demo on the homepage.

§ 01 The pattern under the hype.

There are two kinds of AI projects in the market right now: the ones that ship a feature, and the ones that ship a result. The first kind ends in a press release. The second kind ends in a P&L line.

We've been running engagements across allied health, auto services, professional services, and SaaS - and the result-shaped ones are, almost without exception, doing one of four boring things.

Reading something a human used to read.
Writing something a human used to write.
Routing something a human used to route.
Summarising something a human had to scroll through.

That's it. There is no fifth thing yet. The fifth thing is being demoed, but it isn't paying back.

Quick note

If you can't describe what your AI project does in one of those four verbs, you're shipping a feature - not a result.

§ 02 What “pays back” actually means.

“Pays back” is doing a lot of work in this essay, so let's pin it down. We measure payback the same way we measure any operational change:

Hours saved per week, measured before and after.
Latency removed from the customer-visible loop.
Error rate compared to the human-only baseline.
Leadership-visible signal: would the CEO notice if you turned it off?

The last one is the one most consultants skip - and the one most owners feel first. If you can switch the system off without anyone noticing for a week, it isn't paying back. It's running.
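The first three measures reduce to a simple before-and-after comparison. Here is a minimal sketch of that arithmetic, assuming you have one measurement window for the human-only baseline and one for the AI-assisted loop. The class name, field names, and every number below are illustrative placeholders, not figures from a real engagement. The fourth measure stays a question you ask a person, so it is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class LoopMeasurement:
    """One measurement window for a workflow (field names are illustrative)."""
    human_hours_per_week: float
    median_latency_hours: float  # how long the customer waits for a response
    error_rate: float            # fraction of items handled incorrectly, 0..1


def payback_report(before: LoopMeasurement, after: LoopMeasurement) -> dict:
    """Compare the human-only baseline with the AI-assisted loop."""
    return {
        "hours_saved_per_week": before.human_hours_per_week - after.human_hours_per_week,
        "latency_removed_hours": before.median_latency_hours - after.median_latency_hours,
        # Negative means fewer errors than the human-only baseline.
        "error_rate_change": after.error_rate - before.error_rate,
    }


# Placeholder numbers, purely to show the shape of the comparison.
baseline = LoopMeasurement(human_hours_per_week=12.0, median_latency_hours=18.0, error_rate=0.05)
assisted = LoopMeasurement(human_hours_per_week=3.0, median_latency_hours=1.0, error_rate=0.04)
report = payback_report(baseline, assisted)
```

If all three numbers are flat, the switch-off question will probably give the same answer.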

A project that can be quietly turned off for a week without anyone noticing isn't shipping a result. It's shipping a feature.

- Field notes · № 042

§ 03 The four loops worth automating first.

If you only deploy one AI workload this quarter, deploy it inside one of these four loops. They are ordered by reliability of payback, not size of opportunity.

1. Inbound triage

Every business has a bucket of inbound - leads, support tickets, claims, intake forms - that a human has to read, classify, and route. This is the single most reliable place for AI to pay back in 2025. Read, classify, route. Three verbs. Confidence-score the borderline cases and escalate them. Done.
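As a rough sketch, the read, classify, route loop with escalation could look like the following. The `classify` function here is a keyword stand-in for whatever model you would actually call, and the threshold, labels, and team names are all assumptions made for illustration.

```python
from dataclasses import dataclass

ESCALATION_THRESHOLD = 0.8  # assumed cut-off; tune it against your own data

# Hypothetical mapping from label to the team that owns it.
ROUTES = {"lead": "sales", "support": "helpdesk", "claim": "claims"}


@dataclass
class Classification:
    label: str
    confidence: float


def classify(text: str) -> Classification:
    """Stand-in for a real model call: keyword match with made-up confidences."""
    lowered = text.lower()
    if "quote" in lowered or "pricing" in lowered:
        return Classification("lead", 0.93)
    if "broken" in lowered or "error" in lowered:
        return Classification("support", 0.88)
    return Classification("support", 0.40)  # nothing matched: low confidence


def route(text: str) -> str:
    """Return the destination queue, escalating borderline cases to a human."""
    result = classify(text)
    if result.confidence < ESCALATION_THRESHOLD or result.label not in ROUTES:
        return "human_review"
    return ROUTES[result.label]
```

The design choice that matters is the last function: anything below the threshold goes to a person rather than to a best guess.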

2. Document extraction

If your team is retyping numbers from PDFs, screenshots, or scanned invoices into a spreadsheet or an ERP, you are sitting on a payback waiting to happen. The technology was unreliable in 2022, marginal in 2023, and is now boringly reliable in 2025 - if you build it with the right human checkpoint.
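One way to picture the human checkpoint is as a set of cheap consistency checks that run on every extracted record before it reaches the ERP. The sketch below assumes a hypothetical invoice schema with amounts in cents. The field names, the checks, and the two outcome labels are illustrative, not a prescribed design.

```python
from dataclasses import dataclass


@dataclass
class InvoiceFields:
    """Hypothetical schema for the fields extracted from one invoice."""
    invoice_number: str
    subtotal_cents: int
    tax_cents: int
    total_cents: int


def validation_problems(fields: InvoiceFields) -> list[str]:
    """Return every consistency check the extracted record fails."""
    problems = []
    if not fields.invoice_number.strip():
        problems.append("missing invoice number")
    if min(fields.subtotal_cents, fields.tax_cents, fields.total_cents) < 0:
        problems.append("negative amount")
    if fields.subtotal_cents + fields.tax_cents != fields.total_cents:
        problems.append("subtotal + tax does not equal total")
    return problems


def checkpoint(fields: InvoiceFields) -> str:
    """Push clean records; hold anything suspicious for a person to check."""
    return "needs_human_review" if validation_problems(fields) else "push_to_erp"
```

Records that pass go straight through. Records that fail land in a short review queue, which is where the human time is now spent.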

3. Outbound research & enrichment

“Find me 200 prospects that match this profile, with the right contact, the right context, and a personalised opening.” Three years ago this was a $40K manual SDR project. Today it's a workflow that runs overnight and gets reviewed in the morning. The leverage is real, and the ceiling is high.

4. Weekly summarisation

The quiet, unglamorous win. A pipeline that ingests your messy weekly signal - Slack, ops dashboards, CRM, support tickets - and produces a clean, opinionated Monday-morning brief for the leadership team. Not a chart. A short essay, written for whoever runs the business.

# A workflow, not magic.
inbound  → classify(confidence) → route(team) || escalate(human)
document → extract(schema)     → validate(checkpoint) → push(ERP)
outbound → research(ICP)       → personalise(draft)   → human_review()
summary  → ingest(sources)     → opinionate(template) → deliver(email)

§ 04 Where AI doesn't pay back yet.

Equal honesty here. There are categories where the spend is still ahead of the result. We've watched clients get burned in all of them.

Customer-facing chatbots on marketing sites. The conversion lift is marginal at best, the brand risk is real, and the boring contact form usually still wins on every measured metric.
“Decision-making” agents at the centre of your business. Anywhere the cost of a single wrong decision is high (medical, legal, financial), the right move is human-in-the-loop with AI as the assistant - not the agent.
Knowledge bases that need company-specific reasoning. Retrieval works; reasoning over messy proprietary data is still expensive to do well, and most “internal AI” projects we've audited haven't measured payback at all.
Code that writes code in production-critical systems. Generated scaffolding is excellent. Generated production logic with no senior review is, today, a known footgun.

§ 05 A 30-minute test you can run today.

You don't need a strategy session for this. You need thirty minutes and a notebook. Here's the test we run on every engagement:

Pick a workflow your team complains about. Not the strategic one. The annoying one.
Time it. Honestly. How many human hours per week, measured over the last four weeks?
Describe it in one of the four verbs from § 01. If you can't, stop - that's a signal.
Sketch the trigger, the input, and the human checkpoint. (If there's no human checkpoint, you aren't ready.)
Estimate the smallest possible v1. If it's bigger than four weeks of work, scope it down.

If you finish that exercise and you're still excited about the project, you've found something worth building. If you finish it and the workflow has shrunk to nothing - congratulations, the project just paid back at zero cost.
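If you would rather keep the notebook in code, the five steps above can be sketched as a small screening function. This is one possible encoding, not a formal scoring model: the field names are hypothetical, and the only thresholds used are the ones stated in the steps (four verbs, a human checkpoint, a v1 no bigger than four weeks).

```python
from dataclasses import dataclass

FOUR_VERBS = {"read", "write", "route", "summarise"}
MAX_V1_WEEKS = 4


@dataclass
class WorkflowCandidate:
    name: str
    hours_per_week: float       # averaged over the last four weeks
    verb: str                   # which of the four verbs describes it, or ""
    has_human_checkpoint: bool
    v1_estimate_weeks: float


def screen(candidate: WorkflowCandidate) -> list[str]:
    """Return reasons to stop or rescope; an empty list means 'worth building'."""
    concerns = []
    if candidate.hours_per_week <= 0:
        concerns.append("no measurable human time to save")
    if candidate.verb.lower() not in FOUR_VERBS:
        concerns.append("cannot be described with one of the four verbs")
    if not candidate.has_human_checkpoint:
        concerns.append("no human checkpoint sketched")
    if candidate.v1_estimate_weeks > MAX_V1_WEEKS:
        concerns.append("v1 is bigger than four weeks; scope it down")
    return concerns
```

An empty list is the "still excited" outcome. Anything else tells you which step to revisit.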

Diagnostic short-circuit

About 1 in 3 of the AI projects we audit during our diagnostic die in this 30-minute test. The other 2 in 3 are clearly worth building. The diagnostic just shortens the path.

§ 06 A note to CEOs.

You will be pitched a hundred AI features in the next twelve months. Maybe ten of them are worth shipping for your business. Two or three of those are worth shipping this quarter. The other ninety are noise.

The way to tell the difference isn't more demos. It's the owner's lens - what would change on Monday morning if this worked? If you can't answer that in one sentence, the answer is “nothing measurable” - and you've found a feature, not a result.

If a one-sentence answer comes out clean, you've found something worth ten more minutes. And probably four to eight weeks of focused build after that.

That's where AI pays back. Quietly, in the unglamorous loops, on a Monday morning.

Author

Shammika M

Founder & CEO. 25+ years of building real software for real businesses. Writes The CEO's Brief - one essay a week for non-technical CEOs.
