MATS ML Alignment Theory Scholars program
Quick Assessment
| Aspect | Rating | Notes |
|---|---|---|
| Program Scale | High | 98 scholars and 57 mentors in most recent cohort (MATS 8.0, Summer 2025)1 |
| Research Output | Strong | 160+ publications, 8,000+ citations, h-index of 40 over 4 years2 |
| Career Impact | Very High | 80% of alumni work in AI alignment; placements at Anthropic, OpenAI, and DeepMind3 |
| Funding per Scholar | ~$27k direct (~$35k total) | $15k stipend + $12k compute resources, plus housing, meals, and office space4 |
| Selectivity | Very Competitive | ≈15% acceptance rate; 40+ mentors with independent selection5 |
Overview
The ML Alignment & Theory Scholars (MATS) Program is a 12-week educational and research fellowship designed to develop talented researchers in AI alignment, governance, and security through intensive mentorship and independent research.6 Founded in 2021 and initially run as SERI MATS under the Stanford Existential Risks Initiative, the program became independent by Summer 2023 and now operates in-person cohorts in Berkeley, California and London, United Kingdom.7
MATS pairs scholars with leading researchers in AI safety for approximately 1-2 hours of mentorship per week, supplemented by seminars, workshops, guest lectures, and dedicated research manager support.8 The program provides comprehensive support including a $15,000 living stipend, $12,000 in compute resources, private housing, catered meals, and office space.9 Scholars develop independent research projects that culminate in presentations at a Scholar Symposium, with selected fellows invited to continue for 6-12 month extensions.
Since its first cohort in early 2022, MATS has supported 213 scholars and 47 mentors across its first five seasonal programs, expanding to 98 scholars and 57 mentors by MATS 8.0 in Summer 2025.10 The program has generated over 160 research publications with more than 8,000 citations, advancing agendas in mechanistic interpretability, sparse feature analysis, activation engineering, and AI safety evaluation.11 Alumni have founded new AI safety organizations like Apollo Research and secured positions at major AI labs, with 80% remaining in alignment-related work.12
History
Founding and Early Development
MATS originated as SERI MATS, an initiative under the Stanford Existential Risks Initiative (SERI) focused on AI safety research training, launching its first programs in early 2022.13 The initial program structure included a 4-week online upskilling phase (10 hours per week), a 2-week research sprint, and an 8-week intensive in-person program in Berkeley.14 Early mentors included Alex Gray, Beth Barnes, Evan Hubinger, John Wentworth, Leo Gao, and Stuart Armstrong.15
The program’s core mission from inception was to train talented individuals for AI alignment research by addressing risks from unaligned AI through mentorship, training, logistics, and community access.16 By Summer 2023, the program had evolved into an independent organization while maintaining its Berkeley and London hubs.17
Program Evolution and Growth
Over its first four years, MATS iterated significantly on its structure and curriculum:
Summer 2022: The program’s first summer cohort produced notable outcomes, including scholars like Johannes Treutlein working under Evan Hubinger, who co-authored papers on predictive models later published at the UAI 2023 conference.18
Summer 2023 (4th Iteration): This cohort expanded to 60 scholars and 15 mentors, with 461 applicants (15% acceptance rate for the Training Phase).19 The program introduced the Scholar Research Plan (SRP), requiring a threat model, theory of change, and a SMART (specific, measurable, achievable, relevant, time-bound) plan, and implemented distinct phases: Training (Alignment 201), Research (Berkeley), and Extension (London/Berkeley).
Winter 2023-24 (5th Iteration): Further growth to 63 scholars and 20 mentors, with a significant curriculum change replacing Alignment 201 with custom curricula due to feedback.20 This included Neel Nanda’s remote mechanistic interpretability curriculum (November 20-December 22) and AI Safety Strategy Discussions.
MATS 8.0 (Summer 2025): The program reached 98 scholars and 57 mentors, concluding with a symposium on August 22, 2025 featuring 10 spotlight talks and a poster session.21
By May 2024, MATS had supported 213 scholars and 47 mentors across five seasonal programs, presenting insights on talent selection and development at the TAIS 2024 conference.22
Program Structure and Support
Core Components
MATS operates as a 12-week in-person fellowship with several key elements:
Mentorship: Scholars receive approximately 1-2 hours per week of one-on-one mentorship from established researchers via Slack or direct communication.23 Each mentor conducts their own selection process, with some using work tasks and others conducting interviews focused on ML experience, research proposals, and conceptual alignment questions rather than behavioral assessments.24
Research Development: Scholars develop a Research Plan approximately one month into the program, outlining their threat model, theory of change, and specific deliverables.25 Dedicated research managers provide support for scoping projects, maintaining progress, and removing obstacles throughout the fellowship.26
Educational Programming: The program includes seminars and workshops 2-3 times per week, featuring speakers from organizations like Redwood Research, FAR AI, OpenAI, CHAI, and GovAI.27 Past speakers have included Buck Shlegeris, Adam Gleave, William Saunders, Andrew Critch, Lennart Heim, and Ajeya Cotra.
Research Tracks: MATS offers multiple specialization areas including technical governance, empirical research, policy & strategy, theory, and compute governance.28
Financial and Logistical Support
The program provides comprehensive material support valued at approximately $35,000 per scholar:29
- $15,000 stipend for living expenses (provided by AI Safety Support)30
- $12,000 compute budget for experiments and evaluations31
- Private housing for the full program duration in Berkeley or London32
- Office space access and catered meals33
- Travel reimbursement where applicable
Extension Opportunities
Selected scholars may continue for an additional 6-12 months through extension programs in London, Berkeley, Boston, or Washington D.C., with MATS arranging funding to cover monthly stipends, compute resources, housing, and office rent.34 Historically, approximately 70% of scholars have been accepted for extensions based on research plans and mentor endorsements.35
Research Impact and Outcomes
Publications and Citations
Over four years, MATS has produced significant research output, with alumni generating over 160 publications that have received more than 8,000 citations, yielding an organizational h-index of 40.36 Notable publications include:
- Steering Llama 2 via Contrastive Activation Addition (Outstanding Paper Award at ACL 2024; the underlying steering recipe is sketched after this list)37
- Conditioning Predictive Models: Risks and Strategies (published at UAI 2023)38
- Incentivizing Honest Performative Predictions with Proper Scoring Rules (UAI 2023)
- Neural Networks Learn Statistics of Increasing Complexity
- Copy Suppression
- Inverse Scaling
- The Reasons That Agents Act
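The contrastive activation addition approach behind the first paper above works, roughly, by averaging the difference in a model's internal activations between prompts that do and do not exhibit a target behavior, then adding a scaled copy of that difference back into the forward pass at inference time. The sketch below illustrates the mechanism on a toy PyTorch model rather than Llama 2; the module sizes, hook placement, and steering coefficient are illustrative assumptions, not details taken from the paper.

```python
# Toy illustration of difference-of-means activation steering (a minimal sketch
# in the spirit of Contrastive Activation Addition, not the paper's implementation).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a language model: each Linear+ReLU pair plays the role of a block.
model = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),  # we read and later steer the output of this ReLU
    nn.Linear(64, 8),
)
steer_layer = model[3]

# 1. Capture activations at the chosen layer for two contrastive input batches.
captured = {}
def capture_hook(module, inputs, output):
    captured["act"] = output.detach()

handle = steer_layer.register_forward_hook(capture_hook)
pos_inputs = torch.randn(32, 16)  # stand-ins for prompts exhibiting the target behavior
neg_inputs = torch.randn(32, 16)  # stand-ins for prompts exhibiting the opposite behavior
model(pos_inputs); pos_acts = captured["act"]
model(neg_inputs); neg_acts = captured["act"]
handle.remove()

# 2. The steering vector is the difference of the two mean activations.
steering_vector = pos_acts.mean(dim=0) - neg_acts.mean(dim=0)

# 3. At inference time, add a scaled copy of the vector to that layer's output.
coefficient = 2.0  # hypothetical value; in practice this multiplier is swept
def steering_hook(module, inputs, output):
    return output + coefficient * steering_vector

handle = steer_layer.register_forward_hook(steering_hook)
steered = model(torch.randn(4, 16))  # forward pass with steering applied
handle.remove()
print(steered.shape)  # torch.Size([4, 8])
```

In the published method, the analogous vector is computed from contrastive prompt pairs and added to a transformer's residual stream at a chosen layer, but the difference-of-means-plus-hook pattern above is the core of the recipe.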
In a survey of alumni from the first four programs (46% response rate), 78% reported their key publication “possibly” or “probably” would not have happened without MATS, with 10% accelerated by more than 6 months and 14% accelerated by 1-6 months.39
Research Agendas Developed
MATS scholars have advanced numerous technical agendas in AI safety:
- Sparse auto-encoders for AI interpretability40
- Activation and representation engineering
- Emergent misalignment detection
- Inoculation prompting techniques
- Developmental interpretability
- Computational mechanics applications
- Glitch token analysis
- Situational awareness evaluations
- Gradient routing methods
- Externalized reasoning oversight
- Formalizing natural abstractions
These research directions span mechanistic interpretability, sparse feature analysis, and studies of latent representations in AI systems.41
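As a concrete illustration of the first agenda above, the following is a minimal sparse autoencoder sketch, assuming PyTorch; the dimensions, L1 penalty weight, and random stand-in activations are placeholder choices, not parameters from any MATS project. The aim is to learn an overcomplete, sparsely activating feature basis that reconstructs a model's internal activations.

```python
# Minimal sparse autoencoder (SAE) sketch: reconstruct activation vectors through
# an overcomplete hidden layer while an L1 penalty keeps feature activations sparse.
import torch
import torch.nn as nn

torch.manual_seed(0)

d_model, d_features = 128, 1024  # model hidden size vs. (larger) learned feature basis

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model, d_features):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, x):
        features = torch.relu(self.encoder(x))  # sparse, non-negative feature activations
        reconstruction = self.decoder(features)
        return reconstruction, features

sae = SparseAutoencoder(d_model, d_features)
optimizer = torch.optim.Adam(sae.parameters(), lr=1e-3)
l1_weight = 1e-3  # trades reconstruction fidelity against sparsity

# Random stand-in for residual-stream activations collected from a language model.
activations = torch.randn(4096, d_model)

for step in range(100):
    batch = activations[torch.randint(0, activations.shape[0], (256,))]
    reconstruction, features = sae(batch)
    loss = ((reconstruction - batch) ** 2).mean() + l1_weight * features.abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final training loss: {loss.item():.4f}")
```

In interpretability work, the learned decoder directions are then examined individually, for example by looking at which inputs most strongly activate each feature.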
Career Outcomes
MATS has achieved strong career placement results for alumni:
Employment: 49% of surveyed alumni reported working or interning on AI alignment or control, with 29% conducting independent alignment research.42 Among earlier cohorts, 39% were hired by research organizations post-MATS, with 50% indicating MATS made them “much more likely” to be hired.43 An additional 22% pursued Master’s or PhD programs.
Organizational Placements: Alumni have joined nearly every major AI safety initiative, including Anthropic, OpenAI, DeepMind, CHAI, and Redwood Research.44 Notable examples include:
- Nina (Summer 2023, mentored by Evan Hubinger): Joined Anthropic as a research scientist; won ACL 2024 Outstanding Paper Award; later mentored SPAR and MATS cohorts45
- Marius Hobbhahn (Winter 2022/23, mentored by Evan Hubinger): Founded and became CEO of Apollo Research, a London-based technical alignment organization focused on scheming evaluations and AI control46
- Johannes Treutlein (Summer 2022, mentored by Evan Hubinger): Pursued PhD at CHAI; joined Anthropic in 2024 for alignment stress-testing47
New Organizations: Alumni have founded new AI safety initiatives including Apollo Research, Cadenza Labs, PRISM Eval, and have organized conferences on singular learning theory and developmental interpretability.48
Skill Development: 49% of alumni reported MATS increased their research or technical skills, while 38% gained legible career capital.49
Key People
Leadership
Ryan Kidd serves as Co-Executive Director of MATS and Co-Founder of the London Initiative for Safe AI (LISA).50 He was a scholar in MATS’s first iteration (which had only 5 scholars total) and has since become a Manifund Regrantor and advisor to organizations including Halcyon Futures, Catalyze Impact, AI Safety ANZ, and Pivotal Research.
Christian Smith serves as Co-Executive Director and Co-Founder of LISA.51 He brings a background in particle physics and pedagogy from Stanford University, having conducted research at CERN and organized educational programs like the Uncommon Sense Seminar.
Laura Vaughan, a Thiel Fellow (2017) who studied physics at the University of Waterloo, brings experience in ML model dataset creation and training, management, entrepreneurship, full-stack software engineering, and biomedical research.52 She co-founded a stem cell cryogenics startup before joining MATS.
Notable Mentors
MATS mentors come from leading organizations including Anthropic, Google DeepMind, Redwood Research, OpenAI, MIRI, ARC (Alignment Research Center), CHAI, CAIS, and the Centre on Long-Term Risk.53 Selected examples include:
- Marius Hobbhahn: CEO and Director of Apollo Research focusing on evaluations for scheming and control; PhD in Bayesian ML; formerly worked on AI forecasting at Epoch; was a MATS Winter 2022/23 scholar under Evan Hubinger54
- Sam Bowman: Leads AI alignment and welfare research at Anthropic with focus on evaluation; Associate Professor at NYU (on leave); has studied neural network language models since 201255
- Joe Benton: Member of Anthropic’s Alignment Science team working on scalable oversight, control, chain-of-thought monitoring, and evaluations56
- Arthur Conmy: Research Engineer at Google DeepMind on Language Model Interpretability with Neel Nanda; previously conducted influential work on automating interpretability at Redwood Research57
- Evan Hubinger: Provided mentorship for early SERI MATS trials and multiple cohorts; formerly at MIRI, now at Anthropic58
- Neel Nanda: From Google DeepMind; created custom mechanistic interpretability curriculum for MATS including sessions on sparse autoencoders and superposition toy models59
Funding
MATS operates as a non-profit fellowship program sustained through grants and donations rather than generating revenue.60 The program does not coordinate funding directly but relies on partner organizations:
Primary Funding Sources:
- AI Safety Support: Provides the $15,000 stipend for each fellow completing the full program (prorated for partial participation)61
- MATS-arranged funding: Covers extension program costs including monthly stipends, compute, housing, and office rent for 6-12 month extensions62
- Open Philanthropy: Provided grants to support the early SERI MATS trial program under Evan Hubinger63
- Other supporters (2024): Foresight Institute, Survival and Flourishing Fund, Long-Term Future Fund, Craig Falls, and several donors via Manifund64
Per-Scholar Investment: The total cost per scholar is approximately $35,000 for the full program, based on earlier cohorts of roughly 60 scholars and 15 mentors.65 This includes the $15k stipend and $12k in compute resources, with the remaining roughly $8,000 covering housing, meals, office space, and program administration.
Historical Funding: In the 2022 SERI MATS program, scholars received $6,000 after the training and research sprint phase, $16,000 at program completion, plus ongoing discretionary funding, with all accommodation, office, and event expenses covered.66
Criticisms and Concerns
While MATS has achieved strong outcomes, program organizers and alumni have identified several concerns and limitations:
Field Growth Risks
Program organizers acknowledge concerns that MATS’s appeal—particularly access to scaling lab mentors—could attract aspiring AI researchers not primarily focused on existential risk reduction, potentially introducing viewpoints that dilute the field’s epistemic rigor.67 While organizers maintain high selection pressure to prioritize x-risk-motivated scholars, they recognize this tension between growth and field quality as they plan broader advertising.
Mentorship Dependency and Deference
Critics note that scholars might overly defer to mentors, failing to critically analyze assumptions and reducing independent thinking or new viewpoints in the field.68 This concern exists in tension with the opposite problem: insufficient mentorship could lead to excessive peer reliance among inexperienced researchers. MATS rarely accepts scholars without mentors, viewing mentorship as essential for knowledge transfer, which limits scalability and raises barriers since mentors have high entry requirements and capacity constraints.69
Opportunity Costs for Participants
Alumni impact analysis reveals mostly positive views but highlights specific challenges:70
- Time allocation: Non-research tasks like writing proposals and preparing talks divert effort from core research
- Career uncertainty: One alumnus reported that MATS pushed them into technical research despite having less than 70% confidence the shift was positive; another preferred their prior ML engineering role for its deeper technical challenges
- Relationship strain: Some scholars reported impacts on prior commitments, such as strained relationships with PhD supervisors when pausing unrelated work
- Emotional fit: Some felt out of place in the AI safety community or experienced slowed involvement
- Grant stress: Short-term funding uncertainty led some to doubt their counterfactual impact when applying to AI safety roles
Selection Challenges
With approximately 15% acceptance rates and 40+ mentors conducting independent selection, even proficient researchers and engineers with AI safety experience frequently receive rejections due to mentor capacity limits rather than candidate quality.71 Application processes involve mentor-specific interviews on ML experience, research proposals, conceptual questions, and experiments, with rejections common even after strong interviews.
Alumni feedback indicates that scholars with prior research experience often rate MATS superior to alternatives like independent research or “hub-hopping,” though some note they would have preferred later participation after building more ML skills through programs like ARENA.72
Key Uncertainties
Section titled “Key Uncertainties”- Scalability: Can MATS maintain research quality while expanding beyond current mentor capacity constraints, given the program’s emphasis on apprenticeship-style learning?
- Counterfactual Impact: What proportion of alumni would have entered AI safety careers through alternative pathways, and how much does MATS accelerate versus redirect talent?
- Optimal Program Length: Is the 12-week duration optimal for research skill development, or would longer or shorter programs better serve different scholar populations?
- Field Dilution Risks: As MATS expands and advertises more broadly, how can the program maintain epistemic standards while increasing accessibility?
- Extension Selection: With ~70% of scholars historically advancing to extensions, what criteria best predict long-term research impact?
- Mentor-Scholar Matching: How can the program optimize matching between mentors and scholars to balance deference concerns against knowledge transfer benefits?