# Mag.AI-Marketing — Assessment Framework
Castalia Institute — Magister of Artificial Intelligence in Marketing
Authoritative copy: part of the MyST project (myst.yml).
This document defines how students are evaluated: dimensions, rubrics, term deliverables, SAMWISE journals, faculty defenses, thesis requirements, grading outcomes, and anti-gaming rules, all aligned with the broader Magisterium evaluation philosophy.
## 1. Assessment philosophy
### No exams
Mag.AI-Marketing does not use timed examinations as a primary measure of mastery. Competence is demonstrated through artifacts that can be inspected, run, and questioned. Coursework and thesis components are expected to live in iNQspace (or equivalent approved lineage) so reviewers can rerun and trace iterations. Where MCP connects to live marketing systems, Dimension D (Deployment) expects documented tool scope, credentials handling, and what was actually invoked — not undocumented API access.
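To make that last expectation concrete, the sketch below shows one hypothetical way a student might record tool scope and invocations when MCP connects to a live system; the field names and structure are illustrative assumptions, not a program-mandated schema.

```python
# Hypothetical sketch of a tool-scope and invocation record for Dimension D evidence.
# Field names and structure are illustrative assumptions, not a required schema.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ToolInvocation:
    tool: str       # tool name as exposed by the connected MCP server
    purpose: str    # why it was called, in the student's own words
    timestamp: str  # when it was actually invoked (UTC, ISO 8601)

@dataclass
class DeploymentDisclosure:
    mcp_server: str            # which server / marketing system was connected
    scopes_granted: list[str]  # e.g. read-only analytics vs. write access to campaigns
    credentials_handling: str  # where secrets live and who can see them
    invocations: list[ToolInvocation] = field(default_factory=list)

    def log(self, tool: str, purpose: str) -> None:
        """Append a record at the moment a tool is actually invoked."""
        self.invocations.append(
            ToolInvocation(tool, purpose, datetime.now(timezone.utc).isoformat())
        )
```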
### Artifact-based
Evaluation centers on executable marketing world models, simulations, deployments, and evidence (logs, metrics, validation memos). If it cannot be shown, it cannot be assumed.
### Iterative
Credit accrues through revision cycles: models improve under perturbation, reflection, and external challenge.
### External scrutiny
Self-assessment alone is insufficient. Faculty review, SAMWISE-assisted reflection (with human oversight), and defense-style questioning provide independent pressure on claims.
### Calibration (program responsibility)
Artifact grading scales poorly without norming. The program should:
- Publish anonymized exemplar artifacts at defined score bands where possible
- Run reviewer calibration (e.g. shadow scoring on past defenses) before solo reviews
- Revisit rubrics when new AI capabilities (e.g. generative tools) change what “original work” and “deployment evidence” mean
This reduces inflation and inconsistent bars across the eighteen courses.
## 2. Five assessment dimensions (1–5 scale)
Each dimension is scored 1–5 by reviewers. Scores are evidence-backed; narrative without artifacts caps scores.
### Dimension A — World Construction
Question: Is the ontology clear (audience, message, channel, measurement)? Are rules encoded faithfully?
| Score | Descriptor |
|---|---|
| 5 — Exceptional | Ontology is crisp; entities/relationships defined; rules match assumptions; edge cases scoped. |
| 4 — Strong | Coherent world with minor ambiguities; rules mostly explicit. |
| 3 — Competent | Understandable model; some hand-waving; known inconsistencies listed. |
| 2 — Marginal | Confused ontology or hidden rules. |
| 1 — Insufficient | Not a formal world model; critical mechanics missing. |
Evidence: schema/notes, rule definitions, assumption list, change log.
### Dimension B — Simulation Quality
Question: Does the simulation run robustly, cover meaningful scenarios, and produce interpretable outputs?
| Score | Descriptor |
|---|---|
| 5 — Exceptional | Broad scenario coverage; sensitivity analysis; traceable outputs; failure modes explored. |
| 4 — Strong | Solid scenarios; good diagnostics; minor blind spots. |
| 3 — Competent | Runs end-to-end; limited scenarios. |
| 2 — Marginal | Brittle runs; thin scenarios. |
| 1 — Insufficient | Non-reproducible or outputs not tied to mechanics. |
Evidence: runnable artifact, scenario definitions, seeds/parameters, notebooks.
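As one way to satisfy the seeds/parameters evidence, a minimal sketch that pins a random seed and stores scenario inputs alongside outputs; the scenario fields and the toy response model are assumptions for illustration only.

```python
# Minimal sketch: pin the random seed and store scenario inputs next to outputs,
# so a reviewer can rerun the same scenario and trace results back to mechanics.
# The scenario fields and the toy response model are illustrative assumptions.
import json
import random

def run_scenario(seed: int, params: dict) -> dict:
    rng = random.Random(seed)  # seeded RNG, so repeated runs produce identical draws
    # Toy stand-in for the actual simulation mechanics
    reach = sum(rng.random() < params["attention_rate"] for _ in range(params["audience_size"]))
    return {"seed": seed, "params": params, "reach": reach}

if __name__ == "__main__":
    scenario = {"audience_size": 10_000, "attention_rate": 0.12}
    result = run_scenario(seed=42, params=scenario)
    # Persist inputs and outputs together as part of the reproducibility pack
    with open("scenario_run_042.json", "w") as f:
        json.dump(result, f, indent=2)
```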
### Dimension C — Insight
Question: Can the student interpret outcomes, identify drivers, and separate signal from vanity metrics?
| Score | Descriptor |
|---|---|
| 5 — Exceptional | Sharp causal reasoning; challenges own priors with evidence. |
| 4 — Strong | Clear interpretation; honest limitations. |
| 3 — Competent | Mostly correct interpretation; some metric confusion. |
| 2 — Marginal | Storytelling without mechanism linkage. |
| 1 — Insufficient | Conclusions unsupported by runs. |
Evidence: analysis memo, annotated results, SAMWISE journal excerpts.
### Dimension D — Deployment
Question: Did the student connect the model to reality through deployment, measurement, and honest accounting?
| Score | Descriptor |
|---|---|
| 5 — Exceptional | Real deployment; clear measurement; ethical boundaries respected. |
| 4 — Strong | Real integration; credible metrics. |
| 3 — Competent | Partial deployment or proxy measures. |
| 2 — Marginal | Toy deployment; weak metrics. |
| 1 — Insufficient | No real deployment where required. |
Evidence: deployment description, metrics methodology, before/after reasoning.
### Dimension E — Iteration
Question: Is there a credible improvement trajectory across versions and feedback?
| Score | Descriptor |
|---|---|
| 5 — Exceptional | Strong revision history; targeted fixes; fork/compare discipline. |
| 4 — Strong | Clear iterations; good documentation. |
| 3 — Competent | Some iteration visible; uneven documentation. |
| 2 — Marginal | Mostly one-shot edits. |
| 1 — Insufficient | No meaningful iteration. |
Evidence: version history, changelog, dated artifacts, defense Q&A.
### Composite use (guidance)
Typical competency bar for a course artifact: mean dimension score ≥ 3.0 with no dimension below 2.5, unless a syllabus specifies otherwise. Thesis defense uses stricter thresholds (program policy).
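The default bar can be checked mechanically. A minimal sketch, assuming dimension scores are recorded as plain numbers; a syllabus or thesis policy may override the thresholds.

```python
# Minimal sketch of the default course-artifact bar: mean >= 3.0 and no dimension < 2.5.
# Thresholds are parameters so a syllabus or thesis policy can tighten them.
def meets_bar(scores: dict[str, float], mean_floor: float = 3.0, dim_floor: float = 2.5) -> bool:
    mean = sum(scores.values()) / len(scores)
    return mean >= mean_floor and min(scores.values()) >= dim_floor

# Example: strong overall, but Dimension D drags the artifact below the bar
example = {"A": 4, "B": 3.5, "C": 3, "D": 2, "E": 3.5}
print(meets_bar(example))  # False: the mean is 3.2, but D = 2 is below the 2.5 floor
```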
## 3. Per-term assessment
### End of Term I — “Audience & Message Worlds” portfolio
- Course artifacts for AINS-M6001–M6006
- Integrated reflection package (SAMWISE journal excerpts)
- Reproducibility pack
- Term synthesis memo: how the attention/credibility substrate connects to audience models
### End of Term II — “Campaign & Growth Worlds” portfolio
- Campaign, creative testing, and attribution artifacts as specified
- Stress test report (AINS-M6106) on reputation / crisis scenarios
- Data systems evidence for AINS-M6103 (methodology, not screenshots only)
- Term synthesis memo: what actually moved decisions vs theater
### End of Term III — “Strategic Worlds” + thesis
- Arena, policy, and budget artifacts
- Autonomous subsystem (AINS-M6204) and governance (AINS-M6205)
- Magisterium Thesis package (AINS-M6206)
## 4. SAMWISE journal requirements
- Strong journals include mechanism-linked specificity, before/after belief updates, honest limitations, and cross-links between courses.
- Weak journals include vague summaries and claims without pointers to runs or metrics.
- AI-generated journal text must be disclosed where required; undisclosed substitution is misconduct.
## 5. Faculty defense protocol
The structure matches Mag.AI-Business: presentation, panel Q&A, artifact deep dive, deliberation, and verdict. Panel composition guidance: a minimum of three reviewers, at least one of them from Mag.AI-Marketing, with an external reviewer recommended for thesis defenses.
## 6. Magisterium Thesis requirements (AINS-M6206)
The thesis proves: marketing is a world model with attention, trust, and measurable persuasion — validated in simulation and reality.
### Required components
- Formal world model — ontology, rules, parameters, boundaries.
- Simulation results — scenarios, sensitivity, limits of the model.
- Real-world deployment — defined boundaries (human vs automated).
- Measurable outcome — methodology; humility about confounders.
- Evidence chain — reproducible artifacts and documentation.
## 7. Grading: Pass / Revise & Resubmit / Fail
Grading is aligned with Magisterium standards: a Pass requires complete artifacts, evidence-backed claims, and no integrity violations. Revise & Resubmit applies when merit exists but the gaps are specific and fixable.
## 8. Anti-gaming provisions
- Artifacts must exist and run.
- Baselines are expected where improvement is claimed.
- Reproducibility may be checked (see the sketch at the end of this section).
- Defenses are live.
- Deployments require evidence.
- Iteration must be visible.

Fabricated metrics, fabricated deployments, or undisclosed misrepresentation → Fail and possible program sanctions.
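As one illustration of what a reproducibility check could look like in practice, the sketch below reruns a recorded scenario and compares fingerprints; the file layout, the rerun hook, and the hashing approach are assumptions rather than a mandated procedure.

```python
# Illustrative reproducibility check: rerun a recorded scenario and compare fingerprints.
# The file layout, the rerun hook, and the hashing approach are assumptions.
import hashlib
import json

def fingerprint(result: dict) -> str:
    """Stable hash of a result dict (keys sorted so ordering does not matter)."""
    return hashlib.sha256(json.dumps(result, sort_keys=True).encode()).hexdigest()

def check_rerun(recorded_path: str, rerun_fn) -> bool:
    """Rerun with the recorded seed and parameters, then compare against the stored result."""
    with open(recorded_path) as f:
        recorded = json.load(f)
    rerun = rerun_fn(seed=recorded["seed"], params=recorded["params"])
    return fingerprint(rerun) == fingerprint(recorded)
```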
## Legal Notice
“The Castalia Institute Magisterium confers proprietary credentials based on demonstrated work and evaluation. These credentials are not accredited academic degrees and do not confer professional licensure.”