Turns Out the Problem Was Never the AI

At JGM Innovation, we've been building tools that put AI to work for workforce development. One of them is the AI Audit Bot, a system designed to review our analytics blog posts against a defined set of editorial standards before publication.

The idea was straightforward: give the AI a data analytics blog draft, get back a structured audit report. It would flag weak titles, missing sources, undefined acronyms, and jargon that needed a plain-English rewrite, and make sure every post landed in a clear, reader-friendly shape. My colleague Matt and I built the instruction together, tested it, and waited for the output.
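
To make that concrete, here is roughly the shape of report we were asking for. This is a sketch, not our production schema; the field names are illustrative.

    # A sketch of the audit report structure, assuming a simple
    # finding/verdict layout. Field names are illustrative, not our
    # exact production schema.
    from dataclasses import dataclass, field

    @dataclass
    class Finding:
        element: str   # what was checked, e.g. "title", "sources", "acronyms"
        severity: str  # "Critical", "Important", or "Minor"
        note: str      # plain-English explanation of the issue

    @dataclass
    class AuditReport:
        verdict: str   # "Pass", "Needs Revision", or "Major Revision"
        findings: list[Finding] = field(default_factory=list)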

The feedback came back. And it didn't strictly follow our instruction.

My first instinct was to blame the model. But before doing that, I actually sat down and reread what I had written. That's when I found the real problem, and it wasn't the AI.

I had given it evaluation labels: Pass, Needs Revision, Major Revision. But I never defined what conditions triggered each verdict. I told it to flag missing elements, but never listed what the required elements were. I assigned three severity tiers, Critical, Important, and Minor, but never explained how to tell them apart.

The AI wasn't failing. It was filling in blanks I had left open.

So I rewrote the rubrics. I added explicit criteria for every judgment call. I ran it again with the same model, and the output was noticeably sharper.
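
For illustration, the rewritten rubric looked something like this. It is a simplified sketch; the specific triggers are hypothetical, not our exact wording.

    # A simplified sketch of an explicit rubric. The specific triggers
    # are hypothetical; the point is that every label has a written
    # condition the model can check against.
    RUBRIC = {
        "required_elements": [
            "descriptive title",
            "cited sources",
            "acronyms defined on first use",
            "plain-English summary",
        ],
        "severity": {
            "Critical":  "blocks publication, e.g. a missing source or unsupported claim",
            "Important": "hurts readability, e.g. an undefined acronym or unexplained jargon",
            "Minor":     "polish, e.g. a weak title or an awkward transition",
        },
        "verdict": {
            "Pass":           "no Critical or Important findings",
            "Needs Revision": "no Critical findings, but at least one Important finding",
            "Major Revision": "one or more Critical findings",
        },
    }

The exact tiers matter less than the fact that each one now has a written trigger, so the model no longer has blanks to fill in.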

The lesson wasn't really about AI. It was about clarity. If your own evaluation criteria are ambiguous to you, they are invisible to the model. Prompt engineering, at its core, is just the discipline of thinking precisely about your own standards before asking someone else, human or AI, to apply them.

That's a skill worth building, whatever tools you're working with.

Written by Iris Niu, Data Analytics Intern, JGM Innovation. In collaboration with Matthew Tso, Innovation Team.
