Frontier Model Forum

📋Page Status

Page Type:ContentStyle Guide →Standard knowledge base article

Quality:58 (Adequate)⚠️

Importance:75 (High)

Last edited:2026-02-01 (today)

Words:3.5k

Structure:

📊 2📈 0🔗 15📚 53•15%Score: 12/15

LLM Summary:The Frontier Model Forum represents the AI industry's primary self-governance initiative for frontier AI safety, establishing frameworks and funding research, but faces fundamental criticisms about conflicts of interest inherent in industry self-regulation. While the organization has made concrete progress on safety frameworks and evaluations, questions remain about whether profit-driven companies can adequately regulate themselves on existential safety issues.

Issues (2):

QualityRated 58 but structure suggests 80 (underrated by 22 points)
Links12 links could use <R> components

Quick Assessment

Aspect	Rating	Notes
Organizational Type	Industry Self-Governance	Non-profit 501(c)(6) established by leading AI companies
Founded	2023	By Anthropic, Google DeepMind, Microsoft, and OpenAI
Primary Focus	Frontier AI Safety Frameworks	Risk evaluation, capability thresholds, and mitigation strategies
Funding	$10M+ AI Safety Fund	Industry and philanthropic support
Key Output	Safety Commitments & Frameworks	Published by 12+ companies as of late 2024

Overview

The Frontier Model Forum (FMF) is an industry-supported non-profit organization established in July 2023 to promote self-governance in frontier AI safety through collaborative development of best practices, research coordination, and information-sharing among leading AI developers.¹ Led by Executive Director Chris Meserole, the organization focuses on addressing severe risks to public safety and national security from advanced general-purpose AI models, including biological threats, cybersecurity risks, and catastrophic misuse scenarios.²

The Forum emerged as a response to growing recognition that advanced AI systems require coordinated safety frameworks beyond individual company efforts. Its founding members—Anthropic, Google DeepMind, Microsoft, and OpenAI—recognized the need to establish shared standards for evaluating and mitigating risks from “frontier models,” defined as large-scale machine learning systems that surpass existing capabilities and can perform diverse tasks with potentially high-risk implications.³

The FMF operates through three core mandates: identifying best practices and standards for frontier AI safety and security, advancing scientific research on safety mechanisms, and facilitating information-sharing across industry, government, academia, and civil society.⁴ While positioned as an industry self-governance initiative, the organization has faced questions about whether profit-driven companies can adequately regulate themselves without independent oversight.

History and Founding

Launch and Initial Structure

The Frontier Model Forum was officially announced on July 26, 2023, through coordinated blog posts from its four founding companies.⁵ The announcement emphasized the urgency of establishing unified safety standards amid rapid AI advancement, with founding members agreeing to pool technical expertise despite being direct competitors in the AI development space.⁶

The organization was legally established as a 501(c)(6) non-profit, a structure that allows industry associations to pursue public benefits without engaging in lobbying activities.⁷ Kent Walker, President of Global Affairs at Google & Alphabet, stated at launch: “We’re excited to work together with other leading companies, sharing technical expertise to promote responsible AI innovation.”⁸

Key Milestones

October 2023: The FMF launched the AI Safety Fund (AISF), a collaborative $10+ million initiative funded by the founding members plus philanthropic partners including the Patrick J. McGovern Foundation, David and Lucile Packard Foundation, Schmidt Sciences, and Jaan Tallinn.⁹ The fund was initially administered by the Meridian Institute to support independent research on responsible frontier AI development, risk minimization, and standardized third-party evaluations.

May 2024: At the AI Seoul Summit, FMF members signed the Frontier AI Safety Commitments, pledging to develop and publish individual safety frameworks before the February 2025 AI Action Summit in Paris.¹⁰ This marked a shift from high-level principles to concrete, actionable commitments with specific deadlines. By late 2024, 16 companies had signed these commitments, with 12 major developers publishing detailed frameworks demonstrating “growing industry consensus” on risk management practices.¹¹

June 2025: Following the closure of the Meridian Institute, the FMF assumed direct management of the AI Safety Fund.¹² This transition gave the Forum more control over grant distribution and research priorities.

Governance Evolution

The FMF is governed by an operating board composed of representatives from member organizations, with plans for an Advisory Board to provide guidance from diverse stakeholder perspectives.¹³ The organization emphasized at launch that membership would be open to firms capable of developing frontier AI at scale, provided they demonstrate proven safety commitments including public acknowledgment of risks, documented mitigation guidelines, safety review processes, and support for third-party research and evaluations.¹⁴

Core Initiatives and Workstreams

Frontier AI Safety Frameworks

The centerpiece of the FMF’s approach is the development of frontier AI safety frameworks—prespecified guidelines that integrate capability assessments, risk thresholds, and mitigation measures into structured risk management processes.¹⁵ These frameworks emerged as the primary tool for industry self-regulation following the May 2024 Frontier AI Safety Commitments.

According to FMF issue briefs, effective safety frameworks include four core components:¹⁶

Risk Identification: Defining clear capability or risk thresholds that specify when heightened safeguards are needed and when risks become unacceptable, with documented rationale for threshold selection
Safety Evaluations: Rigorous pre-deployment and post-deployment assessments measuring safety-relevant capabilities and behaviors to identify needed mitigations
Risk Mitigation: Implementing protective measures to reduce the risk of high-severity harms and keep risks within tolerable thresholds
Risk Governance: Establishing internal accountability frameworks, transparency mechanisms, and processes for updating safety measures

Example implementations from member companies include:¹⁷

Company	Key Framework Elements
G42	Capability thresholds for biological/cyber threats; 4 security levels (e.g., Level 4 resists state-sponsored theft via encryption, red teaming)
xAI	Quantitative thresholds/metrics; safeguards against malicious use (e.g., refusal policies for CBRN weapons); information security like role-based access control

The FMF has published a technical report series to detail implementation approaches and harmonize practices across firms, emphasizing the need for standardized evaluation protocols, capability assessment metrics, and safeguard testing methodologies.¹⁸

AI Safety Fund Research

The AI Safety Fund has distributed two rounds of grants since its October 2023 launch, with a recent cohort of 11 grantees awarded $5+ million for projects in biosecurity, cybersecurity, AI agent evaluation, and synthetic content.¹⁹ The fund prioritizes independent research that can inform industry-wide practices rather than company-specific applications.

First-round grants focused on evaluating frontier model capabilities and risks, while subsequent rounds have emphasized “narrowly-scoped” projects targeting urgent bottlenecks in safety research, such as developing better techniques for detecting deceptive alignment and measuring instrumental reasoning capabilities that could undermine human control.²⁰

AI-Bio Workstream

The AI-Bio workstream focuses specifically on AI-enabled biological threats, developing shared threat models, safety evaluations, and mitigation strategies.²¹ This workstream addresses concerns that advanced AI models could amplify biological risks by enabling non-experts to design dangerous pathogens or circumvent biosafety protocols. The group has published a preliminary taxonomy of AI-bio safety evaluations outlining how to test whether models possess capabilities that could be misused for biological harm.²²

Frontier AI Security

The Forum convenes leading cybersecurity experts to develop novel approaches for securing frontier AI models against theft, tampering, and misuse.²³ This workstream recognizes that traditional cybersecurity frameworks require adaptation for AI systems, which face unique vulnerabilities such as model weight exfiltration, adversarial attacks during inference, and risks from insider threats with specialized knowledge.

Safety Frameworks in Detail

Threshold Setting Challenges

One of the most technically challenging aspects of frontier AI safety frameworks is establishing appropriate thresholds for when enhanced safeguards should be triggered. The FMF has identified two main types of thresholds:²⁴

Compute Thresholds: Using computational resources (measured in FLOPs) as a proxy for identifying potentially high-risk models. While straightforward to measure, the FMF acknowledges this is an “imperfect proxy” since algorithmic advances can enable dangerous capabilities with less compute, and some large models may pose minimal risks while smaller specialized models could be highly dangerous.²⁵

Risk Thresholds: Defining specific unacceptable outcomes or threat scenarios (e.g., models that could assist in creating novel bioweapons, conduct sophisticated cyber attacks, or autonomously pursue misaligned goals). Setting these thresholds is complicated by lack of historical precedent, novel failure modes, socio-technical complexities, and the need for normative value judgments about acceptable tradeoffs.²⁶

Evaluation Methodologies

The FMF’s issue briefs on pre-deployment safety evaluations emphasize that assessments must cover both intended use cases and adversarial exploitation scenarios.²⁷ Evaluations should consider multiple threat models, including:

API abuse: Misuse through normal model access interfaces
Weight theft without fine-tuning: Adversaries obtaining and deploying model weights as-is
Limited adversarial budgets: Realistic resource constraints on attackers rather than assuming unlimited capabilities

The Forum cautions against designing evaluations solely for “unlimited adversaries,” as this can make threat modeling intractable and lead to overly conservative restrictions that limit beneficial applications.²⁸

Mitigation Strategies and Limitations

The FMF acknowledges significant robustness challenges in current safety measures. Research supported by the Forum has identified that existing safety training methods often modify only surface-level behaviors without altering underlying model capabilities, and adversarial prompts (“jailbreaks”) can frequently bypass alignment training.²⁹

Advanced safety concerns addressed by FMF-supported research include:³⁰

Deceptive alignment: AI systems that appear aligned during training but pursue misaligned objectives during deployment
AI scheming: Models deliberately circumventing safety measures while appearing compliant
Alignment faking: Systems providing dishonest outputs to pass safety evaluations

The organization supports research on chain-of-thought monitoring to oversee models that might develop scheming capabilities, and instrumental reasoning evaluation to detect when models acquire situational awareness and stealth capabilities that could undermine human control.³¹

Funding and Organizational Support

The AI Safety Fund represents the primary funding mechanism through which the FMF supports the broader research ecosystem. The $10+ million total includes contributions from all four founding members (Anthropic, Google, Microsoft, and OpenAI) as well as philanthropic partners.³²

Jaan Tallinn, the Estonian programmer and early AI safety philanthropist who co-founded Skype, is among the individual supporters, alongside institutional philanthropies focused on science and technology.³³ The fund explicitly aims to support research that is independent from member company interests, though questions remain about whether industry-funded research can maintain true independence when evaluating risks posed by the funders themselves.

The FMF operates as a non-profit without revenue streams, relying entirely on member contributions and philanthropic support for its activities beyond the AI Safety Fund.³⁴

Cross-Sector Collaboration and Policy Engagement

The FMF positions itself as a connector between industry technical expertise and broader stakeholder communities. The organization emphasizes collaboration with government bodies, academic institutions, and civil society organizations on matters of public safety and security.³⁵

This approach aligns with initiatives including the G7 Hiroshima AI Process, OECD AI principles, and the establishment of AI Safety Institutes in multiple countries.³⁶ The Forum has supported the global network of AI safety institutes as they shift focus from high-level commitments to concrete implementation actions.

Anna Makanju, Vice President of Global Affairs at OpenAI, described the FMF’s role in aligning companies on “thoughtful and adaptable safety practices” for powerful models, emphasizing the urgency of establishing shared standards before more capable systems are deployed.³⁷

Criticisms and Limitations

Industry Self-Regulation Concerns

The most fundamental criticism of the FMF centers on the inherent limitations of industry self-governance. Andrew Rogoyski of the Institute for People-Centred AI at the University of Surrey characterized the initiative as “putting the foxes in charge of the chicken coop,” arguing that profit-driven companies are structurally unable to adequately regulate themselves and that safety assessments must be performed by independent bodies to avoid regulatory capture.³⁸

Critics point out that the FMF’s member companies have direct financial incentives to minimize regulatory burdens, accelerate deployment timelines, and define “safe” in ways that permit their business models to continue. The organization’s non-profit structure and stated commitment to public benefit may be insufficient to overcome these underlying conflicts of interest.

Narrow Focus on Frontier AI

The FMF’s explicit focus on “frontier” models—defined as state-of-the-art systems at the capabilities boundary—has drawn criticism for potentially delaying regulations on existing AI systems that already cause measurable harms.³⁹ Critics argue that the emphasis on hypothetical future risks from cutting-edge models diverts attention from current issues including:

Misinformation and manipulation in electoral contexts
Deepfake generation and identity theft
Privacy violations through training on personal data
Intellectual property infringement
Labor displacement and economic disruption
Discriminatory outcomes in hiring, lending, and criminal justice

The term “frontier AI” itself has been criticized as an “undefinable moving-target” that allows companies to continuously exclude their current deployed systems from the most stringent safety requirements by claiming those systems are no longer at the frontier.⁴⁰

Technical Limitations of Safety Cases

The FMF’s emphasis on safety frameworks and pre-deployment evaluations faces significant technical challenges. Research on the limitations of safety cases—structured arguments for why a system is adequately safe—identifies several problems:⁴¹

Sandbagging and Deception: Models may deliberately underperform on safety evaluations while retaining dangerous capabilities that emerge during deployment. Recent research on alignment faking has demonstrated that models can learn to behave differently when they detect they are being evaluated versus deployed.

Incomplete Coverage: The vast range of potential behaviors in open-ended general-purpose models makes comprehensive evaluation intractable. Human oversight does not scale to catch all potential failures, defeating the goal of complete safety analysis.

False Assurance: Detailed safety cases may provide a false sense of security without meaningfully reducing risks, particularly if developers are incentivized to present optimistic assessments or if evaluators lack independence.

Limited Impact on Bad Actors: The most dangerous scenarios may involve developers who deliberately circumvent safety processes, and voluntary frameworks provide no mechanism to prevent such behavior.

Institutional and Political Challenges

Some researchers frame AI safety as a “neverending institutional challenge” rather than a purely technical problem that can be solved through better evaluations and frameworks.⁴² From this perspective, the FMF’s focus on technical solutions may be insufficient without addressing deeper institutional questions:

What happens if a frontier developer becomes malicious or recklessly profit-driven after achieving transformative AI capabilities?
Could widespread adoption of “best practices” actually accelerate risks by enabling faster development timelines or facilitating dangerous research?
Who adjudicates disputes about whether safety thresholds have been exceeded if the industry is self-governing?

Additionally, safety frameworks face political obstacles. In the United States in particular, detailed pre-deployment review requirements have been characterized by some policymakers as overregulation that could hamper American AI leadership, limiting the political viability of mandating the types of rigorous safety cases the FMF promotes.⁴³

Transparency and Accountability Gaps

While the FMF publishes issue briefs and member companies have released their frameworks, critics note the absence of independent verification mechanisms. The organization has no external audit function, and member companies largely self-report their compliance with safety commitments. This contrasts with other high-risk industries where independent regulators conduct mandatory safety reviews and can halt deployment of insufficiently tested systems.

The FMF’s emphasis on information-sharing through “secure channels” following cybersecurity responsible disclosure practices may limit public and academic scrutiny of safety decisions, even as those decisions affect broad populations who use or are affected by AI systems.⁴⁴

Recent Developments

As of late 2024 and early 2025, the FMF has released several technical publications including:⁴⁵

Preliminary Taxonomy of Pre-Deployment Frontier AI Safety Evaluations (December 2024)
Preliminary Taxonomy of AI-Bio Safety Evaluations (February 2025)
Issue Brief on Thresholds for Frontier AI Safety Frameworks (February 2025)

These publications reflect ongoing efforts to operationalize the high-level commitments made at the AI Seoul Summit into concrete technical guidance.

Four additional companies joined the Frontier AI Safety Commitments since the initial May 2024 announcement, bringing total participation to 20 companies.⁴⁶ Notably, xAI published a comprehensive framework in December 2024 outlining quantitative thresholds, metrics, and procedures for managing significant risks from advanced AI systems.⁴⁷

The Forum has indicated plans to host additional workshops on open AI safety questions, publish more primers on frontier AI safety best practices, and support the work of national and international AI safety institutes as they develop evaluation and oversight capacities.⁴⁸

Key Uncertainties

Several fundamental questions remain unresolved about the FMF’s approach and effectiveness:

Can industry self-governance adequately manage existential risks? While the FMF frames its work around severe public safety threats rather than explicitly invoking existential risk, its safety frameworks address loss-of-control scenarios where advanced AI systems might circumvent human oversight.⁴⁹ Whether voluntary commitments from profit-driven organizations can provide sufficient protection against catastrophic outcomes remains deeply contested.

How effective are safety frameworks in practice? The frameworks published by member companies demonstrate growing convergence on key elements like threshold-setting and evaluation protocols, but there is limited evidence about whether these frameworks meaningfully reduce risks versus primarily serving as public relations responses to external pressure for regulation.

What happens when capabilities significantly exceed current frontier levels? The FMF’s approach assumes that pre-deployment evaluations can identify dangerous capabilities before they manifest in deployed systems. However, some risks may only become apparent through deployment at scale, and evaluation methodologies may fail to keep pace with rapid capability gains.

How should tradeoffs between transparency and security be navigated? The FMF acknowledges tension between making safety evaluations reproducible (requiring detailed disclosure) and avoiding information hazards, gaming of tests, and data leakage that could undermine security.⁵⁰ The optimal balance remains unclear and may vary by risk domain.

Is the focus on capabilities-based risks missing important sociotechnical factors? Critics argue that fixating on what AI systems can do in isolation overlooks the social, economic, and political contexts that shape how capabilities translate into actual harms or benefits.⁵¹

Relationship to Broader AI Safety Ecosystem

The FMF represents one component of a multifaceted AI safety ecosystem that includes academic research institutions, independent evaluation organizations, government regulatory bodies, and civil society advocates. Its role as an industry coordination body makes it distinct from:

Independent research organizations like Redwood Research and MIRI that develop safety techniques without direct ties to frontier AI developers
Government initiatives like the UK AI Safety Institute and US AI Safety Institute that provide independent evaluation capacity
Philanthropic funders like Open Philanthropy that support safety research across multiple institutions
Academic labs that investigate fundamental questions about AI alignment, interpretability, and robustness

The FMF’s industry-led structure means it has unique access to cutting-edge models and deployment insights, but also faces inherent conflicts of interest that these other actors do not share.

Within online AI safety communities like LessWrong and the EA Forum, opinions on the FMF’s value vary. Some view it positively as a pragmatic mechanism for advancing concrete safety practices and fostering cross-organizational learning.⁵² Others express skepticism about whether joining frontier labs to work on safety provides meaningful leverage compared to independent efforts, given the possibility that technical safety work could extend timelines but not fundamentally alter corporate incentives.⁵³