AT Think

Overcoming AI false confidence: Governing what you can't see

The most dangerous number in AI adoption is not the error rate. It's the confidence rate.

A system that is right 98% of the time is also wrong 2% of the time. It presents both results with the same certainty. In a firm running hundreds of returns, that is not abstract math. It is a handful of deliverables, indistinguishable from the rest, carrying errors the system does not know it made. The firm and its partners own every one of them.

AI technology is new, so it feels like it should demand a new kind of governance. It doesn't. The profession already has the answer in house.

Managers have spent their careers governing non-deterministic systems, otherwise known as junior staff.

A first-year practitioner is probabilistic. They have good days and bad ones. They make mistakes in rough proportion to their experience, and confidence does not always track skill. They hand you work you did not watch them do. You evaluate it without having been in the room.

The profession's answer was never to demand perfect juniors. It was layered review, sign-off discipline, and escalation paths scaled to what the work touched. Behind it all sits a partner who owns the output no matter who produced it. That is not a patch for weak staff. It is the core governance structure of a professional firm, refined against a century of this exact problem.

That structure transfers straight to AI. The underlying problem is the same. You are governing output from a system that can be wrong in ways you cannot predict, at a volume you cannot check line by line. The review pyramid doesn't change. What changes is the first layer of the pyramid. A junior who isn't sure asks questions. AI emits no such signal. It does not hedge, flag or hesitate. The firm has to engineer the signal it used to get for free, then feed it into the review structure it already owns.

1. Make the system tell you when it isn't sure

The core failure is that AI severs confidence from correctness. The fix is to rebuild that link by design. You don't review a bare answer. You review an answer that carries a confidence score, and you route by that score. Low confidence escalates to a human. High confidence flows through. You can also let the system answer the same question more than one way, and treat disagreement as the flag.
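The routing described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `ModelAnswer` type, the `route` function, and the 0.90 threshold are all hypothetical, and it assumes the system reports a confidence score between 0 and 1 with each answer.

```python
from dataclasses import dataclass

# Hypothetical cutoff; a firm would calibrate this against reviewed work.
ESCALATION_THRESHOLD = 0.90


@dataclass
class ModelAnswer:
    value: str          # the answer itself, e.g. a computed amount
    confidence: float   # assumed to be reported by the system, 0.0 to 1.0


def route(answers: list[ModelAnswer]) -> str:
    """Decide where one question goes, given one or more independent answers."""
    if not answers:
        return "human_review"  # nothing to evaluate, so a person looks
    # Disagreement between independent attempts is itself the flag.
    if len({a.value for a in answers}) > 1:
        return "human_review"
    # Agreement is not enough if any attempt was unsure.
    if min(a.confidence for a in answers) < ESCALATION_THRESHOLD:
        return "human_review"
    return "auto_pass"
```

Asking the same question two ways and passing both answers to `route` covers both signals in the paragraph above: a low score escalates, and so does a mismatch between attempts, even when each attempt sounded confident.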

None of this happens on its own. It is something you specify and build. The shift in mindset matters more than the mechanics. You are not asking the model to be right more often. You are asking it to be honest about when it is not. That honesty is an engineering requirement, not a trait you can hope for.

2. Govern the work before it is done, instead of at completion

AI needs onboarding and training just like junior staff. New hires don't just need the task. They need the firm's boundaries, its processes, the way things are done here and the lines you don't cross. AI is no different. The higher-leverage control sits upstream, in what you teach the system before it ever touches live work. Constrain what it is allowed to attempt, and on what terms.

Then give the system rules to work inside, the way you'd give a new hire a checklist. Route every task through a simple gate first. Can this be done by a rule? If yes, a deterministic system does it, and the answer is right by construction. The prior installment made this point. An error the design rules out is one you never have to catch.
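One way to picture the gate is a lookup that runs before any model is called. The sketch below is illustrative only: the `RULES` table, the `dispatch` function, the task names, and the rate used in the example are hypothetical, not drawn from any particular firm's workflow or any tax schedule.

```python
from typing import Callable, Optional


def mileage_deduction(miles: float, rate_per_mile: float) -> float:
    """A deterministic rule: same inputs, same answer, every time."""
    return round(miles * rate_per_mile, 2)


# Hypothetical rule table mapping task types to deterministic functions.
RULES: dict[str, Callable[..., float]] = {
    "mileage_deduction": mileage_deduction,
}


def dispatch(task_type: str, **inputs) -> tuple[str, Optional[float]]:
    """Gate every task: if a rule exists, use it; otherwise send it on."""
    rule = RULES.get(task_type)
    if rule is not None:
        return ("rule", rule(**inputs))
    # No rule covers this task, so it goes to the model and then to review.
    return ("model_with_review", None)
```

For example, `dispatch("mileage_deduction", miles=100, rate_per_mile=0.5)` is answered by the rule, while an unlisted task such as `dispatch("draft_client_memo")` falls through to the model path with review attached.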

The same logic carries past the gate. Where you can, replace inspection with checks the answer has to pass. Citations. Reconciliations. Range tests. A number that doesn't reconcile flags itself, no matter how confident the model sounded. Stop trying to catch a wrong answer by reading it. Build the work so a wrong answer can't pass without tripping a check it can't talk its way around.
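Reconciliations and range tests are simple to express as code. The following is a sketch under stated assumptions: the function names, the zero default tolerance, and the bounds passed in are hypothetical choices, and a real workflow would define its own checks for each deliverable.

```python
from decimal import Decimal


def reconciles(line_items: list[Decimal], reported_total: Decimal,
               tolerance: Decimal = Decimal("0.00")) -> bool:
    """True when the line items sum to the reported total within tolerance."""
    return abs(sum(line_items, Decimal("0")) - reported_total) <= tolerance


def in_range(value: Decimal, low: Decimal, high: Decimal) -> bool:
    """True when the value sits inside the expected bounds, inclusive."""
    return low <= value <= high


def failed_checks(line_items: list[Decimal], reported_total: Decimal,
                  low: Decimal, high: Decimal) -> list[str]:
    """Return the names of the checks the answer failed; empty means it passed."""
    failures = []
    if not reconciles(line_items, reported_total):
        failures.append("reconciliation")
    if not in_range(reported_total, low, high):
        failures.append("range")
    return failures
```

The checks never consult the model's confidence. A total that does not match its own line items is flagged regardless of how the answer was worded.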

3. Give the system the memory it doesn't have

AI will not remember the corrections we gave it last March unless we explicitly build that memory for it.

Think about how we learned our craft. A mistake on a return, a partner's note in the margin, a lesson that stung enough to stay with us for the rest of our careers. We carry it forward without trying. The system can't, so its memory has to be built by hand. A logged rule. A test case. A new gate. Built this way, each correction strengthens the review process itself. That is the foundation for work that gets better every year.
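A correction logged as a test case might look like the sketch below. The `Correction` record and the `regressions` function are hypothetical names, and `system` stands in for whatever produces the answer; the only assumption is that it can be called with the same inputs again.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Correction:
    description: str   # what the reviewer caught, in plain words
    inputs: dict       # the inputs that produced the wrong answer
    expected: str      # the answer the reviewer signed off on


def regressions(corrections: list[Correction],
                system: Callable[[dict], str]) -> list[str]:
    """Re-run every logged correction and report the ones that fail again."""
    return [c.description for c in corrections
            if system(c.inputs) != c.expected]
```

Run before each busy season or after any change to the system, an empty list means last year's lessons still hold. Anything else names the lesson that was lost.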

The partner has always signed for work they didn't personally do. AI changes the staffer, not the signature.

This has been true since the first associate was hired. A partner's job was never only to produce the work. They own the relationship. They review what can't be delegated. The work goes out under their name, and they answer for it.

So where does a firm start?

Start with the work nobody wants to do. The administrative grind, not the judgment. The tasks that repeat, that follow a rule, that have few edge cases and a clear right answer. These are the jobs AI is built for, and the ones where a mistake is cheap to catch and easy to fix.

Win there first. Prove it works with your staff in the loop. Let them see where it holds and where it doesn't, and build their confidence on the easy problems before they try anything harder.

Then stack the wins. Each one earns the next. Move to the next process only when the last one is running clean.

Stay on top of every one of them. Someone owns the result. Someone checks it on a set cadence and confirms it is still holding. And the trail is captured as the work runs, so when a number is questioned, you can show how it got there. The human stays in the loop on the work that carries risk, and steps back from the work that runs like a calculator.
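Capturing the trail as the work runs can be as simple as appending a record at each step. This is a minimal sketch: the field names, the `record_step` and `history` functions, and the in-memory list are hypothetical, and a real firm would write to durable, access-controlled storage.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def record_step(trail: list[str], task_id: str, step: str, source: str,
                result: str, reviewer: Optional[str] = None) -> None:
    """Append one record of who or what produced a result, and when."""
    entry = {
        "task_id": task_id,
        "step": step,
        "source": source,        # e.g. "rule", "model", or "staff"
        "result": result,
        "reviewer": reviewer,    # None until a person signs off
        "at": datetime.now(timezone.utc).isoformat(),
    }
    trail.append(json.dumps(entry, sort_keys=True))


def history(trail: list[str], task_id: str) -> list[dict]:
    """Return every recorded step for one task, in the order it happened."""
    return [e for e in map(json.loads, trail) if e["task_id"] == task_id]
```

When a number is questioned, `history` returns the steps behind it, including which ones a person reviewed.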

That is the whole method. Start small, prove it, keep a human accountable, and move up only as the system earns it.

The firms getting this right did not treat AI as a special case. They saw what they already had. A governance structure built for this exact problem, strong enough to carry a new tool without bending. What they added was the one new piece: the machinery that makes a silent system raise its hand.

The profession's product is trust. We have always produced it the same way, with probabilistic people governed by a deterministic process. AI is a new tool, and a powerful one, for firms that put it inside that process. Firms that call AI ungovernable, and firms that trust it to govern itself, make the same mistake in opposite directions. Delegate carefully. Review rigorously. The signature at the bottom is still yours, no matter who or what produced the work above it.

