# Mag.AI-Marketing — Assessment Framework
Castalia Institute — Magister of Artificial Intelligence in Marketing
Authoritative copy: part of the MyST project (myst.yml).
This document defines how students are evaluated: dimensions, rubrics, term deliverables, SAMWISE journals, faculty defenses, thesis requirements, grading outcomes, and anti-gaming rules, all aligned with the broader Magisterium evaluation philosophy.
## 1. Assessment philosophy
### No exams
Mag.AI-Marketing does not use timed examinations as a primary measure of mastery. Competence is demonstrated through artifacts that can be inspected, run, and questioned. Coursework and thesis components are expected to live in iNQspace (or equivalent approved lineage) so reviewers can rerun and trace iterations. Where MCP connects to live marketing systems, Dimension D (Deployment) expects documented tool scope, credentials handling, and what was actually invoked — not undocumented API access.
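To make that last expectation concrete, the sketch below shows one hypothetical way a student might record tool scope and invocations when MCP connects to a live system; the field names and structure are illustrative assumptions, not a program-mandated schema.

```python
# Hypothetical sketch of a tool-scope and invocation record for Dimension D evidence.
# Field names and structure are illustrative assumptions, not a required schema.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ToolInvocation:
    tool: str       # tool name as exposed by the connected MCP server
    purpose: str    # why it was called, in the student's own words
    timestamp: str  # when it was actually invoked (UTC, ISO 8601)

@dataclass
class DeploymentDisclosure:
    mcp_server: str            # which server / marketing system was connected
    scopes_granted: list[str]  # e.g. read-only analytics vs. write access to campaigns
    credentials_handling: str  # where secrets live and who can see them
    invocations: list[ToolInvocation] = field(default_factory=list)

    def log(self, tool: str, purpose: str) -> None:
        """Append a record at the moment a tool is actually invoked."""
        self.invocations.append(
            ToolInvocation(tool, purpose, datetime.now(timezone.utc).isoformat())
        )
```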
### Artifact-based
Evaluation centers on executable marketing world models, simulations, deployments, and evidence (logs, metrics, validation memos). If it cannot be shown, it cannot be assumed.
### Iterative
Credit accrues through revision cycles: models improve under perturbation, reflection, and external challenge.
### External scrutiny
Self-assessment alone is insufficient. Faculty review, SAMWISE-assisted reflection (with human oversight), and defense-style questioning provide independent pressure on claims.
### Calibration (program responsibility)
Artifact grading scales poorly without norming. The program should:
- Publish anonymized exemplar artifacts at defined score bands where possible
- Run reviewer calibration (e.g. shadow scoring on past defenses) before solo reviews
- Revisit rubrics when new AI capabilities (e.g. generative tools) change what “original work” and “deployment evidence” mean
This reduces inflation and inconsistent bars across the eighteen courses.
## 2. Five assessment dimensions (1–5 scale)
Each dimension is scored 1–5 by reviewers. Scores are evidence-backed; narrative without artifacts caps scores.
### Dimension A — World Construction
Question: Is the ontology clear (audience, message, channel, measurement)? Are rules encoded faithfully?
| Score | Descriptor |
|---|---|
| 5 — Exceptional | Ontology is crisp; entities/relationships defined; rules match assumptions; edge cases scoped. |
| 4 — Strong | Coherent world with minor ambiguities; rules mostly explicit. |
| 3 — Competent | Understandable model; some hand-waving; known inconsistencies listed. |
| 2 — Marginal | Confused ontology or hidden rules. |
| 1 — Insufficient | Not a formal world model; critical mechanics missing. |
Evidence: schema/notes, rule definitions, assumption list, change log.
### Dimension B — Simulation Quality
Question: Does the simulation run robustly, cover meaningful scenarios, and produce interpretable outputs?
| Score | Descriptor |
|---|---|
| 5 — Exceptional | Broad scenario coverage; sensitivity analysis; traceable outputs; failure modes explored. |
| 4 — Strong | Solid scenarios; good diagnostics; minor blind spots. |
| 3 — Competent | Runs end-to-end; limited scenarios. |
| 2 — Marginal | Brittle runs; thin scenarios. |
| 1 — Insufficient | Non-reproducible or outputs not tied to mechanics. |
Evidence: runnable artifact, scenario definitions, seeds/parameters, notebooks.
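As one way to satisfy the seeds/parameters evidence, a minimal sketch that pins a random seed and stores scenario inputs alongside outputs; the scenario fields and the toy response model are assumptions for illustration only.

```python
# Minimal sketch: pin the random seed and store scenario inputs next to outputs,
# so a reviewer can rerun the same scenario and trace results back to mechanics.
# The scenario fields and the toy response model are illustrative assumptions.
import json
import random

def run_scenario(seed: int, params: dict) -> dict:
    rng = random.Random(seed)  # seeded RNG, so repeated runs produce identical draws
    # Toy stand-in for the actual simulation mechanics
    reach = sum(rng.random() < params["attention_rate"] for _ in range(params["audience_size"]))
    return {"seed": seed, "params": params, "reach": reach}

if __name__ == "__main__":
    scenario = {"audience_size": 10_000, "attention_rate": 0.12}
    result = run_scenario(seed=42, params=scenario)
    # Persist inputs and outputs together as part of the reproducibility pack
    with open("scenario_run_042.json", "w") as f:
        json.dump(result, f, indent=2)
```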
### Dimension C — Insight
Question: Can the student interpret outcomes, identify drivers, and separate signal from vanity metrics?
| Score | Descriptor |
|---|---|
| 5 — Exceptional | Sharp causal reasoning; challenges own priors with evidence. |
| 4 — Strong | Clear interpretation; honest limitations. |
| 3 — Competent | Mostly correct interpretation; some metric confusion. |
| 2 — Marginal | Storytelling without mechanism linkage. |
| 1 — Insufficient | Conclusions unsupported by runs. |
Evidence: analysis memo, annotated results, SAMWISE journal excerpts.
### Dimension D — Deployment
Question: Did the student connect the model to reality through deployment, measurement, and honest accounting?
| Score | Descriptor |
|---|---|
| 5 — Exceptional | Real deployment; clear measurement; ethical boundaries respected. |
| 4 — Strong | Real integration; credible metrics. |
| 3 — Competent | Partial deployment or proxy measures. |
| 2 — Marginal | Toy deployment; weak metrics. |
| 1 — Insufficient | No real deployment where required. |
Evidence: deployment description, metrics methodology, before/after reasoning.
### Dimension E — Iteration
Question: Is there a credible improvement trajectory across versions and feedback?
| Score | Descriptor |
|---|---|
| 5 — Exceptional | Strong revision history; targeted fixes; fork/compare discipline. |
| 4 — Strong | Clear iterations; good documentation. |
| 3 — Competent | Some iteration visible; uneven documentation. |
| 2 — Marginal | Mostly one-shot edits. |
| 1 — Insufficient | No meaningful iteration. |
Evidence: version history, changelog, dated artifacts, defense Q&A.
### Composite use (guidance)
Typical competency bar for a course artifact: mean dimension score ≥ 3.0 with no dimension below 2.5, unless a syllabus specifies otherwise. Thesis defense uses stricter thresholds (program policy).
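The default bar can be checked mechanically. A minimal sketch, assuming dimension scores are recorded as plain numbers; a syllabus or thesis policy may override the thresholds.

```python
# Minimal sketch of the default course-artifact bar: mean >= 3.0 and no dimension < 2.5.
# Thresholds are parameters so a syllabus or thesis policy can tighten them.
def meets_bar(scores: dict[str, float], mean_floor: float = 3.0, dim_floor: float = 2.5) -> bool:
    mean = sum(scores.values()) / len(scores)
    return mean >= mean_floor and min(scores.values()) >= dim_floor

# Example: strong overall, but Dimension D drags the artifact below the bar
example = {"A": 4, "B": 3.5, "C": 3, "D": 2, "E": 3.5}
print(meets_bar(example))  # False: the mean is 3.2, but D = 2 is below the 2.5 floor
```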
## 3. Per-term assessment
### End of Term I — “Audience & Message Worlds” portfolio
- Course artifacts for AINS-M6001–M6006
- Integrated reflection package (SAMWISE journal excerpts)
- Reproducibility pack
- Term synthesis memo: how the attention/credibility substrate connects to audience models
### End of Term II — “Campaign & Growth Worlds” portfolio
- Campaign, creative testing, and attribution artifacts as specified
- Stress test report (AINS-M6106) on reputation / crisis scenarios
- Data systems evidence for AINS-M6103 (methodology, not screenshots only)
- Term synthesis memo: what actually moved decisions vs theater
### End of Term III — “Strategic Worlds” + thesis
- Arena, policy, and budget artifacts
- Autonomous subsystem (AINS-M6204) and governance (AINS-M6205)
- Magisterium Thesis package (AINS-M6206)
## 4. SAMWISE journal requirements
- Strong journals include mechanism-linked specificity, before/after belief updates, honest limitations, and cross-links between courses.
- Weak journals include vague summaries and claims without pointers to runs or metrics.
- AI-generated journal text must be disclosed where required; undisclosed substitution is misconduct.
## 5. Faculty defense protocol
The structure matches Mag.AI-Business: presentation, panel Q&A, artifact deep dive, deliberation, and verdict. Panel composition guidance: a minimum of three reviewers, at least one of them from Mag.AI-Marketing, with an external reviewer recommended for thesis defenses.
## 6. Magisterium Thesis requirements (AINS-M6206)
The thesis proves: marketing is a world model with attention, trust, and measurable persuasion — validated in simulation and reality.
### Required components
- Formal world model — ontology, rules, parameters, boundaries.
- Simulation results — scenarios, sensitivity, limits of the model.
- Real-world deployment — defined boundaries (human vs automated).
- Measurable outcome — methodology; humility about confounders.
- Evidence chain — reproducible artifacts and documentation.
## 7. Grading: Pass / Revise & Resubmit / Fail
Grading is aligned with Magisterium standards: a Pass requires complete artifacts, evidence-backed claims, and no integrity violations. Revise & Resubmit applies when merit exists but the gaps are specific and fixable.
## 8. Anti-gaming provisions
- Artifacts must exist and run.
- Baselines are expected where improvement is claimed.
- Reproducibility may be checked (see the sketch at the end of this section).
- Defenses are live.
- Deployments require evidence.
- Iteration must be visible.

Fabricated metrics, fabricated deployments, or undisclosed misrepresentation → Fail and possible program sanctions.
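As one illustration of what a reproducibility check could look like in practice, the sketch below reruns a recorded scenario and compares fingerprints; the file layout, the rerun hook, and the hashing approach are assumptions rather than a mandated procedure.

```python
# Illustrative reproducibility check: rerun a recorded scenario and compare fingerprints.
# The file layout, the rerun hook, and the hashing approach are assumptions.
import hashlib
import json

def fingerprint(result: dict) -> str:
    """Stable hash of a result dict (keys sorted so ordering does not matter)."""
    return hashlib.sha256(json.dumps(result, sort_keys=True).encode()).hexdigest()

def check_rerun(recorded_path: str, rerun_fn) -> bool:
    """Rerun with the recorded seed and parameters, then compare against the stored result."""
    with open(recorded_path) as f:
        recorded = json.load(f)
    rerun = rerun_fn(seed=recorded["seed"], params=recorded["params"])
    return fingerprint(rerun) == fingerprint(recorded)
```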
## Legal Notice
“The Castalia Institute Magisterium confers proprietary credentials based on demonstrated work and evaluation. These credentials are not accredited academic degrees and do not confer professional licensure.”