Longterm Wiki

ML Safety — Center for AI Safety Research Hub

web · mlsafety.org/

The official hub of the Center for AI Safety's ML Safety initiative; useful as an entry point for researchers new to the field or seeking structured resources, courses, and community connections.

Metadata

Importance: 62/100 · homepage

Summary

MLSafety.org is the homepage of the ML Safety research community, a project of the Center for AI Safety (CAIS). It organizes resources, education, courses, and competitions focused on reducing risks from AI systems, framing ML safety across four pillars: Robustness, Monitoring, Alignment, and Systemic Safety. The site serves as a hub for researchers and non-technical audiences seeking to engage with AI safety work.

Key Points

  • Defines four core ML safety research areas (Robustness, Monitoring, Alignment, and Systemic Safety), each with concrete subtopics.
  • Hosts the ML Safety Course, newsletter, seminar series, and SafeBench competition for benchmark development.
  • Project of the Center for AI Safety (CAIS), connecting researchers via Slack, Twitter, and events like NeurIPS socials.
  • Covers technical topics including adversarial robustness, interpretability, value learning, power aversion, and cooperative AI.
  • Provides funding opportunities, reading resources, and community infrastructure for the AI safety research ecosystem.

Cited by 1 page

Page                              Type      Quality
Multipolar Trap Dynamics Model    Analysis  61.0

1 FactBase fact citing this source

Cached Content Preview

HTTP 200 · Fetched Apr 7, 2026 · 2 KB
ML Safety

The ML research community focused on reducing risks from AI systems.

 What is ML Safety?

ML systems are rapidly increasing in size, are acquiring new capabilities, and are increasingly deployed in high-stakes settings. As with other powerful technologies, the safety of ML systems should be a leading research priority. This involves ensuring systems can withstand hazards (Robustness), identifying hazards (Monitoring), reducing inherent ML system hazards (Alignment), and reducing systemic hazards (Systemic Safety). Example problems and subtopics in these categories are listed below:

 
 
  • Robustness: Adversarial Robustness, Long-Tail Robustness
  • Monitoring: Anomaly Detection, Interpretable Uncertainty, Transparency, Trojans, Detecting Emergent Behavior
  • Alignment: Honesty, Power Aversion, Value Learning, Machine Ethics
  • Systemic Safety: ML for Improved Epistemics, ML for Improved Cyberdefense, Cooperative AI

ML Safety Projects

 We organize AI/ML safety resources and education for researchers and non-technical audiences.

  • Seminar Series (Coming Soon)
  • The Newsletter
  • NeurIPS 2023 Social
  • Competitions and Prizes
  • ML Safety Course
 
 Get Connected

Stay in the loop and exchange thoughts and news related to ML safety. Join our Slack or follow one of the accounts below.

  • ML Safety (@ml_safety): general announcements
  • ML Safety Daily (@topofmlsafety): ML safety papers as they are released
 
A project by the Center for AI Safety
Resource ID: 48fda4293ccad420 | Stable ID: sid_FMM7tePQvq