
Mag.AI-Marketing — Assessment Framework

Castalia Institute — Magister of Artificial Intelligence in Marketing

Authoritative copy: part of the MyST project (myst.yml).

This document defines how students are evaluated: dimensions, rubrics, term deliverables, SAMWISE journals, faculty defenses, thesis requirements, grading outcomes, and anti-gaming rules aligned with the broader Magisterium evaluation philosophy.


1. Assessment philosophy

No exams

Mag.AI-Marketing does not use timed examinations as a primary measure of mastery. Competence is demonstrated through artifacts that can be inspected, run, and questioned. Coursework and thesis components are expected to live in iNQspace (or equivalent approved lineage) so reviewers can rerun and trace iterations. Where MCP connects to live marketing systems, Dimension D (Deployment) expects documented tool scope, credentials handling, and what was actually invoked — not undocumented API access.
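
A concrete, illustrative shape for that documentation is one structured record per tool invocation. A minimal sketch in Python; the field names and the tool name are hypothetical, not a mandated schema:

```python
import json
from datetime import datetime, timezone

# Illustrative invocation-log entry for a live MCP tool call.
# The program requires the content (scope, credential handling,
# what was invoked); this particular schema is hypothetical.
entry = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "tool": "email_campaign.send_test",  # hypothetical MCP tool name
    "scope": "test sends to seed list only; no production audience",
    "credentials": "vault alias 'mkt-sandbox'; secrets never logged",
    "inputs": {"segment": "seed_list", "template": "v3"},
    "result": "accepted: 25 test sends queued",
}

with open("invocation_log.jsonl", "a") as f:
    f.write(json.dumps(entry) + "\n")
```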

Artifact-based

Evaluation centers on executable marketing world models, simulations, deployments, and evidence (logs, metrics, validation memos). If it cannot be shown, it cannot be assumed.

Iterative

Credit accrues through revision cycles: models improve under perturbation, reflection, and external challenge.

External scrutiny

Self-assessment alone is insufficient. Faculty review, SAMWISE-assisted reflection (with human oversight), and defense-style questioning provide independent pressure on claims.

Calibration (program responsibility)

Artifact grading scales poorly without norming. To keep the bar consistent, the program should maintain shared exemplar artifacts for each score band, double-score a sample of submissions each term, and hold periodic reviewer norming sessions.

This reduces inflation and inconsistent bars across the eighteen courses.


2. Five assessment dimensions (1–5 scale)

Each dimension is scored 1–5 by reviewers. Scores must be evidence-backed; narrative without supporting artifacts caps the score.

Dimension A — World Construction

Question: Is the ontology clear (audience, message, channel, measurement)? Are rules encoded faithfully?

| Score | Descriptor |
| --- | --- |
| 5 — Exceptional | Ontology is crisp; entities/relationships defined; rules match assumptions; edge cases scoped. |
| 4 — Strong | Coherent world with minor ambiguities; rules mostly explicit. |
| 3 — Competent | Understandable model; some hand-waving; known inconsistencies listed. |
| 2 — Marginal | Confused ontology or hidden rules. |
| 1 — Insufficient | Not a formal world model; critical mechanics missing. |

Evidence: schema/notes, rule definitions, assumption list, change log.
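
To make "rules encoded faithfully" concrete, the sketch below encodes a toy ontology with explicit entities and one inspectable rule. A minimal sketch, assuming a Python world model; every name and number is invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Audience:
    segment: str
    trust: float  # 0.0-1.0, prior trust in the brand

@dataclass
class Message:
    claim: str
    evidence_strength: float  # 0.0-1.0

@dataclass
class Channel:
    name: str
    reach: int

def persuasion(a: Audience, m: Message) -> float:
    """Explicit rule: persuasion scales with trust and evidence.

    Documented assumption: effects are multiplicative and capped at 1.0.
    Edge case scoped: zero trust yields zero persuasion.
    """
    return min(1.0, a.trust * m.evidence_strength)
```

The point is not this particular rule but that it is explicit, documented, and open to challenge.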


Dimension B — Simulation Quality

Question: Does the simulation run robustly, cover meaningful scenarios, and produce interpretable outputs?

| Score | Descriptor |
| --- | --- |
| 5 — Exceptional | Broad scenario coverage; sensitivity analysis; traceable outputs; failure modes explored. |
| 4 — Strong | Solid scenarios; good diagnostics; minor blind spots. |
| 3 — Competent | Runs end-to-end; limited scenarios. |
| 2 — Marginal | Brittle runs; thin scenarios. |
| 1 — Insufficient | Non-reproducible or outputs not tied to mechanics. |

Evidence: runnable artifact, scenario definitions, seeds/parameters, notebooks.
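
Recording seeds and parameters next to outputs is what makes a run rerunnable. A minimal sketch, assuming a Python simulation; the scenario logic and parameter names are placeholders:

```python
import json
import random

def run_scenario(seed: int, params: dict) -> dict:
    """Placeholder scenario: the real simulation logic goes here."""
    rng = random.Random(seed)  # seeded RNG -> reproducible run
    conversions = sum(
        rng.random() < params["conv_rate"]
        for _ in range(params["impressions"])
    )
    return {"seed": seed, "params": params, "conversions": conversions}

# Persist seed and parameters alongside the output so a reviewer
# can rerun the exact scenario.
result = run_scenario(seed=42, params={"conv_rate": 0.03, "impressions": 10_000})
print(json.dumps(result))
```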


Dimension C — Insight

Question: Can the student interpret outcomes, identify drivers, and separate signal from vanity metrics?

| Score | Descriptor |
| --- | --- |
| 5 — Exceptional | Sharp causal reasoning; challenges own priors with evidence. |
| 4 — Strong | Clear interpretation; honest limitations. |
| 3 — Competent | Mostly correct interpretation; some metric confusion. |
| 2 — Marginal | Storytelling without mechanism linkage. |
| 1 — Insufficient | Conclusions unsupported by runs. |

Evidence: analysis memo, annotated results, SAMWISE journal excerpts.


Dimension D — Deployment

Question: Did the student connect the model to reality: deployment, measurement, honest accounting?

| Score | Descriptor |
| --- | --- |
| 5 — Exceptional | Real deployment; clear measurement; ethical boundaries respected. |
| 4 — Strong | Real integration; credible metrics. |
| 3 — Competent | Partial deployment or proxy measures. |
| 2 — Marginal | Toy deployment; weak metrics. |
| 1 — Insufficient | No real deployment where required. |

Evidence: deployment description, metrics methodology, before/after reasoning.
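
One honest form of before/after reasoning is a lift calculation against a stated baseline. A sketch with invented numbers, assuming a simple holdout comparison:

```python
# Hypothetical counts: a holdout baseline vs. the deployed variant.
baseline_rate = 120 / 10_000  # conversions per impression, holdout
deployed_rate = 156 / 10_000  # same metric, deployed group

lift = (deployed_rate - baseline_rate) / baseline_rate
print(f"Relative lift: {lift:.1%}")  # 30.0%

# Honest accounting: the memo should report denominators, the
# comparison window, and known confounders (seasonality, overlap).
```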


Dimension E — Iteration

Question: Is there a credible improvement trajectory across versions and feedback?

| Score | Descriptor |
| --- | --- |
| 5 — Exceptional | Strong revision history; targeted fixes; fork/compare discipline. |
| 4 — Strong | Clear iterations; good documentation. |
| 3 — Competent | Some iteration visible; uneven documentation. |
| 2 — Marginal | Mostly one-shot edits. |
| 1 — Insufficient | No meaningful iteration. |

Evidence: version history, changelog, dated artifacts, defense Q&A.
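
Fork/compare discipline can be evidenced by running two model versions on identical inputs so a changelog claim is checkable. A minimal sketch; the version functions are stand-ins for the student's documented entry points:

```python
# Stand-ins for two versions of the same simulation entry point.
def run_v1(seed: int) -> float:
    return 0.0312  # v1 output for this seed (illustrative value)

def run_v2(seed: int) -> float:
    return 0.0287  # v2 output after a claimed fix (illustrative value)

seed = 42
delta = run_v2(seed) - run_v1(seed)
print(f"v1={run_v1(seed):.4f}  v2={run_v2(seed):.4f}  delta={delta:+.4f}")
```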


Composite use (guidance)

Typical competency bar for a course artifact: mean dimension score ≥ 3.0 with no dimension below 2.5, unless a syllabus specifies otherwise. Thesis defense uses stricter thresholds (program policy).
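
That bar reduces to a small, checkable rule. A sketch, assuming each dimension score is the reviewers' mean on the 1–5 scale:

```python
def meets_bar(scores: dict[str, float],
              mean_min: float = 3.0,
              floor: float = 2.5) -> bool:
    """Default course bar: mean >= 3.0 and no dimension below 2.5.

    Syllabi and thesis policy may override these thresholds.
    """
    values = list(scores.values())
    return sum(values) / len(values) >= mean_min and min(values) >= floor

# A strong portfolio with one weak dimension still fails the floor.
print(meets_bar({"A": 4, "B": 4, "C": 4, "D": 2, "E": 4}))  # False
```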


3. Per-term assessment

End of Term I — “Audience & Message Worlds” portfolio

End of Term II — “Campaign & Growth Worlds” portfolio

End of Term III — “Strategic Worlds” + thesis


4. SAMWISE journal requirements

Strong journals include mechanism-linked specificity, before/after belief updates, honest limitations, and cross-links between courses.

Weak journals include vague summaries and claims without pointers to runs or metrics.

AI-generated journal text must be disclosed where required; undisclosed substitution is misconduct.


5. Faculty defense protocol

Same structure as Mag.AI-Business: presentation, panel Q&A, artifact deep dive, deliberation, verdict. Panel composition guidance: a minimum of three reviewers, at least one from Mag.AI-Marketing; an external reviewer is recommended for thesis defenses.


6. Magisterium Thesis requirements (AINS-M6206)

The thesis proves: marketing is a world model with attention, trust, and measurable persuasion — validated in simulation and reality.

Required components

  1. Formal world model — ontology, rules, parameters, boundaries.

  2. Simulation results — scenarios, sensitivity, limits of the model.

  3. Real-world deployment — defined boundaries (human vs automated).

  4. Measurable outcome — methodology; humility about confounders.

  5. Evidence chain — reproducible artifacts and documentation.


7. Grading: Pass / Revise & Resubmit / Fail

Aligned with Magisterium standards: Pass requires complete artifacts, evidence-backed claims, and no integrity violations. Revise & Resubmit applies when the work has merit but the gaps are specific and fixable.


8. Anti-gaming provisions

  1. Artifacts must exist and run.

  2. Baselines expected where improvement is claimed.

  3. Reproducibility may be checked (see the sketch after this list).

  4. Defenses are live.

  5. Deployments require evidence.

  6. Iteration must be visible.
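
A reproducibility check can be as simple as rerunning a submitted scenario with its recorded seed and comparing a hash of the outputs. A sketch, assuming outputs serialize deterministically:

```python
import hashlib
import json

def output_hash(result: dict) -> str:
    """Hash a run's output with stable key ordering for comparison."""
    blob = json.dumps(result, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

# `recorded` comes from the submission; `rerun` from the reviewer's
# re-execution with the same seed (values here are illustrative).
recorded = {"seed": 42, "conversions": 293}
rerun = {"seed": 42, "conversions": 293}

assert output_hash(rerun) == output_hash(recorded), "run does not reproduce"
```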

Fabricated metrics, deployments, or undisclosed misrepresentation → Fail and possible program sanctions.


“The Castalia Institute Magisterium confers proprietary credentials based on demonstrated work and evaluation. These credentials are not accredited academic degrees and do not confer professional licensure.”