Evaluate the quality of your AI engagement practice across four dimensions — scored by you.
Most AI evaluations stop at the output: was it good, was it accurate, was it fit for purpose? This tool goes one level further.
The Meta framework evaluates the practice that produced the output. It asks four questions: did you plan the engagement with intent, did you use and iterate on the conversation effectively, did you reflect honestly on what your practice produced, and did you maintain independent judgement throughout.
If you have run the Output Evaluator Rubric and scored Adequate on Analytical Depth, this tool will help you understand why.
You need the AI engagement you want to evaluate. The framework works with three types of evidence: the full conversation, the conversation alongside other outputs from the same project, or a self-report where you describe your practice without direct access to the exchange.
Each evidence type produces a valid evaluation but with different visibility limits. A self-report is honest about what it is: you are describing your practice rather than demonstrating it. The scoring should reflect that distinction.
This evaluation takes ten to fifteen minutes.
The framework has four dimensions, scored in order. Deliberate Planning covers what happened before the conversation started. Effective Use and Iteration covers how you worked the conversation once it was underway. Self-Evaluation of AI Use covers whether you looked back honestly at what your practice produced. Critical Thinking covers whether you maintained independent judgement throughout.
After the four dimensions, you score Practice Coherence, which asks how well the four dimensions hang together as an integrated whole rather than as isolated competencies. Add notes at each dimension if they would be useful to you later.
Each dimension is scored across five bands. The descriptions are intentionally honest.
Adequate means your practice passed a minimum threshold and nothing more. Capable is genuinely strong. Exemplary means nothing more could reasonably be asked of your practice at this stage. Score what you actually observe, not what you intended.
At the end, you can download a structured PDF of your evaluation. It records your scores, your notes, and a summary, formatted for reference, record-keeping, or sharing.
If you want to see how an AI-generated assessment of the same engagement compares with your own, that is available as a separate tool. The differences between the two evaluations tend to be as instructive as the scores themselves.
Before scoring, describe the AI engagement you are evaluating and how you approached it.
Work through each dimension in order. Select the band that best describes what you observe in your own practice. Add notes if they would be useful to you later.
After scoring the four dimensions, consider your practice as a whole for this final measure. Higher scores are better: a score of 9 to 10 means your four dimensions are working together as an integrated, intentional practice.
In your own words, what are the two or three most significant things you found? What single change would most improve your AI engagement practice?
Draw on your dimensional scores and notes. You do not need to repeat every detail, only the things that matter most.
| Dimension | Band | Score |
|---|---|---|
| Deliberate Planning | — | — |
| Effective Use and Iteration | — | — |
| Self-Evaluation of AI Use | — | — |
| Critical Thinking | — | — |
| Practice Coherence (holistic) | — | — |
When you are satisfied with your evaluation, generate your PDF record.