Assistant Evaluation Basics (3-metric core)
Use this to get a fast, defensible quality baseline before optimization.
Use this when: you need a quality baseline · You'll leave with: 3 metrics that align teams · Fun cue: run it like a mini lab experiment
Metrics
- Task success rate: % of tasks completed without escalation.
- Time to correct outcome: median time from prompt to usable result.
- User trust rating: confidence score collected after each task.
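The three metrics above can be computed from per-session records. This is a minimal sketch, not a prescribed implementation. The `SessionResult` fields, the 1-5 trust scale, and the choice to average trust with a mean are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import List, Optional


@dataclass
class SessionResult:
    """One moderated attempt at one task (hypothetical record shape)."""
    task: str
    completed: bool                      # the user reached the goal
    escalated: bool                      # a human had to step in
    seconds_to_usable: Optional[float]   # None if no usable result was reached
    trust: int                           # post-task confidence, assumed 1-5


def summarize(results: List[SessionResult]) -> dict:
    """Return the 3-metric core for a list of session results."""
    if not results:
        raise ValueError("need at least one session result")

    # Task success rate: completed AND not escalated.
    successes = [r for r in results if r.completed and not r.escalated]

    # Time to correct outcome: median over sessions that reached a usable result.
    times = [r.seconds_to_usable for r in results
             if r.seconds_to_usable is not None]

    return {
        "success_rate_pct": 100.0 * len(successes) / len(results),
        "median_seconds": median(times) if times else None,
        # Assumption: trust is aggregated with a simple mean.
        "mean_trust": mean(r.trust for r in results),
    }
```

The median is used for time because a single stalled session would otherwise dominate a small sample of 5–8 runs.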
Protocol
- Pick 3 representative user tasks.
- Run 5–8 moderated sessions.
- Capture handoff failures and correction loops.
- Prioritize fixes by impact on success + trust.
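One way to make the last protocol step concrete is to score each observed failure mode by how often it coincides with a failed task and by how much trust it costs. The sketch below is only an illustration. The scoring formula, the equal weighting of success and trust, the 1-5 trust scale, and the `rank_failure_modes` name are all assumptions, not part of the protocol.

```python
from collections import defaultdict
from typing import List, Tuple

# One observation: (failure_mode, task_succeeded, trust_rating)
Observation = Tuple[str, bool, int]


def rank_failure_modes(observations: List[Observation],
                       max_trust: int = 5) -> List[Tuple[str, float]]:
    """Rank failure modes by combined impact on success and trust.

    Assumed score = (share of all sessions that failed with this mode)
                  + (trust lost in those sessions, as a share of max possible).
    """
    if not observations:
        return []

    failures = defaultdict(int)
    trust_gap = defaultdict(int)
    for mode, succeeded, trust in observations:
        if not succeeded:
            failures[mode] += 1
        else:
            failures[mode] += 0  # make sure the mode is registered
        trust_gap[mode] += max_trust - trust

    n_all = len(observations)
    scored = []
    for mode in failures:
        success_impact = failures[mode] / n_all
        trust_impact = trust_gap[mode] / (n_all * max_trust)
        scored.append((mode, round(success_impact + trust_impact, 3)))

    # Highest combined impact first.
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With only 5–8 sessions the scores are rough ordering hints, so ties and near-ties are better resolved by discussion than by the number.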
Output template
Task | Success | Time | Trust | Failure mode | Next fix
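If results are collected in a script, the template row can be generated rather than typed by hand. This is a small sketch. The `render_table` helper is hypothetical, and any values passed to it are whatever your sessions produced.

```python
from typing import List, Sequence

# Column order taken from the output template.
COLUMNS = ["Task", "Success", "Time", "Trust", "Failure mode", "Next fix"]


def format_row(values: Sequence[object]) -> str:
    """Join one row with ' | ', enforcing the template's six columns."""
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    return " | ".join(str(v) for v in values)


def render_table(rows: List[Sequence[object]]) -> str:
    """Return the header line followed by one line per task."""
    return "\n".join([format_row(COLUMNS)] + [format_row(r) for r in rows])
```

Rejecting rows with the wrong number of cells keeps the failure mode and next fix from being silently dropped when a session ends early.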
Next best step: Handoff Friction Mapping
Need plain-language onboarding content? Use Zero-to-useful AI setup.