Conversation design teardown
A confident answer is not a correct one.
Here's a support reply that sounds perfect. It's clear, warm, and specific. It's also a guess, because the assistant can't see your refund. Change what the assistant can actually reach, and watch the honest answer move with it. The confident one never does.
Has my refund been processed?
Change one thing: what can the assistant actually see?
What a confident assistant says
Yes, your refund has been processed and should arrive in your account within 3 to 5 business days.
It's clear, warm, and specific. It's also a guess. This assistant can't see your refund, so every word after "Yes" is unsupported.
What a calibrated assistant says
I can explain how refunds usually work, but I cannot see the status of your specific refund from here.
It understood the question. It just has no access to the answer. The move is to separate general knowledge from account knowledge, not to invent the part it cannot see.
I can see your return was received, but I cannot confirm from here whether the refund payment has been issued. The safest next step is to check with support.
A return received makes a refund likely, not certain. An inference is not a verification. A calibrated answer shows the evidence and stops exactly where the evidence stops.
I can see the refund has not been issued yet, but I cannot process it directly from here. I can help you prepare a request for review, or connect you with someone who can.
This is an action boundary. The assistant is not unsure about the request. It is constrained in what it is allowed to do, and that limit should be stated plainly, not hidden behind a cheerful "done".
Your refund was issued on the 24th. It should reach your account within 3 to 5 business days.
Now the confidence is earned. The assistant can see the transaction, so a direct, specific answer is exactly right. Calibration is not always hedging. It is matching the claim to the evidence.
The uncertainty ladder
- It can verify, and it can act.
- It knows the next step, but cannot act.
- It has partial evidence, but cannot verify.
- It understands, but does not know the answer.
- It does not understand the question yet.
A weak assistant always speaks from the top rung. A calibrated one speaks from the rung it is actually standing on.
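One way to make the ladder concrete is to treat each rung as a set of claims the assistant is allowed to make. The sketch below is illustrative only: the `Access` flags, the claim names, and the mapping are assumptions invented for this example, not a description of any production system.

```python
from dataclasses import dataclass
from enum import IntEnum


class Rung(IntEnum):
    """Rungs of the uncertainty ladder, lowest first."""
    DOES_NOT_UNDERSTAND = 0
    UNDERSTANDS_NO_ANSWER = 1
    PARTIAL_EVIDENCE = 2
    KNOWS_NEXT_STEP = 3
    CAN_VERIFY_AND_ACT = 4


@dataclass(frozen=True)
class Access:
    """Hypothetical flags for what the assistant can reach."""
    understood_question: bool = True
    sees_return_status: bool = False
    sees_refund_record: bool = False
    can_issue_refund: bool = False


def rung_for(access: Access) -> Rung:
    """Pick the rung the assistant is actually standing on."""
    if not access.understood_question:
        return Rung.DOES_NOT_UNDERSTAND
    if access.sees_refund_record and access.can_issue_refund:
        return Rung.CAN_VERIFY_AND_ACT
    if access.sees_refund_record:
        return Rung.KNOWS_NEXT_STEP
    if access.sees_return_status:
        return Rung.PARTIAL_EVIDENCE
    return Rung.UNDERSTANDS_NO_ANSWER


# Assumed claim names: each rung adds one kind of claim to the rung below it.
CLAIMS_BY_RUNG = {
    Rung.DOES_NOT_UNDERSTAND: frozenset({"clarifying_question"}),
    Rung.UNDERSTANDS_NO_ANSWER: frozenset({"general_policy"}),
    Rung.PARTIAL_EVIDENCE: frozenset({"general_policy", "observed_evidence"}),
    Rung.KNOWS_NEXT_STEP: frozenset(
        {"general_policy", "observed_evidence", "verified_status"}
    ),
    Rung.CAN_VERIFY_AND_ACT: frozenset(
        {"general_policy", "observed_evidence", "verified_status", "action_taken"}
    ),
}


def is_supported(claim: str, access: Access) -> bool:
    """True if the claim is backed by what the assistant can reach."""
    return claim in CLAIMS_BY_RUNG[rung_for(access)]
```

Under these assumptions, "Yes, your refund has been processed" is a `verified_status` claim. `is_supported("verified_status", Access())` is false because the assistant sees nothing, and it becomes true only once `sees_refund_record` is set. The confident assistant fails this check; the calibrated one passes it by construction.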
The fix was never a warmer sentence.
It was matching the words to what the assistant could actually know, verify, or do. In conversation design, that is not politeness. It is risk design: the wording tells the user how much to trust the answer. A confident wrong answer costs more than an honest "I cannot confirm that from here."
Where this shows up in production
- The booking agent holds, then confirms. It places a tentative calendar hold and confirms by SMS. It never tells a customer they are booked until they actually are.
- Citation faithfulness as the first metric. On a support RAG system, the failure I designed against was the useful-sounding answer that is quietly wrong, not the answer that admits a gap.
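The hold-then-confirm pattern from the booking agent can be sketched as a small state check. This is a minimal illustration, not the production agent: the state names, the `customer_message` function, and the message wording are all assumptions made up for this example.

```python
from enum import Enum


class BookingState(Enum):
    """Hypothetical states for a booking request."""
    HELD = "held"            # tentative calendar hold placed
    CONFIRMED = "confirmed"  # booking actually confirmed


def customer_message(state: BookingState, slot: str) -> str:
    """Only say "booked" once the booking is actually confirmed."""
    if state is BookingState.CONFIRMED:
        return f"You are booked for {slot}."
    return (
        f"I have placed a tentative hold on {slot}. "
        "You will get a text message once it is confirmed."
    )
```

The design choice is that the word "booked" is reachable from exactly one state, so the wording cannot get ahead of the system.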
Work with me
Two ways to work with me.
Hiring a firm: A few consulting engagements each quarter through Intelligent CX Consulting. Start at services.
Hiring a person: Open to conversational AI, AI product, and applied AI roles. The resume is at /resume.