Evaluate the quality of your AI engagement practice across four dimensions — scored by you.
Most AI evaluations stop at the output: was it good, was it accurate, was it fit for purpose? This tool goes one level further.
The Meta framework evaluates the practice that produced the output. It asks four questions: did you plan the engagement with intent, did you use and iterate on the conversation effectively, did you reflect honestly on what your practice produced, and did you maintain independent judgement throughout.
If you have run the Output Evaluator Rubric and scored Adequate on Analytical Depth, this tool will help you understand why.
You need the AI engagement you want to evaluate. The framework works with three types of evidence: the full conversation, the conversation alongside other outputs from the same project, or a self-report where you describe your practice without direct access to the exchange.
Each evidence type produces a valid evaluation but with different visibility limits. A self-report is honest about what it is: you are describing your practice rather than demonstrating it. The scoring should reflect that distinction.
This evaluation takes ten to fifteen minutes.
The framework has four dimensions, scored in order. Deliberate Planning covers what happened before the conversation started. Effective Use and Iteration covers how you worked the conversation once it was underway. Self-Evaluation of AI Use covers whether you looked back honestly at what your practice produced. Critical Thinking covers whether you maintained independent judgement throughout.
After the four dimensions, you score Practice Coherence, which asks how well the four dimensions hang together as an integrated whole rather than as isolated competencies. Add notes at each dimension if they would be useful to you later.
Each dimension is scored across five bands. The descriptions are intentionally honest.
Adequate means your practice passed a minimum threshold and nothing more. Capable is genuinely strong. Exemplary means nothing more could reasonably be asked of your practice at this stage. Score what you actually observe, not what you intended.
At the end, you can download a structured PDF of your evaluation. It records your scores, your notes, and a summary, formatted for reference, record-keeping, or sharing.
If you want to see how an AI-generated assessment of the same engagement compares with your own, that is available as a separate tool. The differences between the two evaluations tend to be as instructive as the scores themselves.
Before scoring, describe the AI engagement you are evaluating and how you approached it.
Work through each dimension in order. Select the band that best describes what you observe in your own practice. Add notes if they would be useful to you later.
After scoring the four dimensions, consider your practice as a whole for this final measure. Higher scores are better: a score of 9 to 10 means your four dimensions are working together as an integrated, intentional practice.
In your own words, what are the two or three most significant things you found? What single change would most improve your AI engagement practice?
Draw on your dimensional scores and notes. You do not need to repeat every detail, only the things that matter most.
| Dimension | Band | Score |
|---|---|---|
| Deliberate Planning | — | — |
| Effective Use and Iteration | — | — |
| Self-Evaluation of AI Use | — | — |
| Critical Thinking | — | — |
| Practice Coherence (holistic) | — | — |
When you are satisfied with your evaluation, generate your PDF record.