Open Source AI Safety
Analysis of whether publicly releasing AI model weights is net positive or negative for safety. The July 2024 NTIA report recommends monitoring but not restricting open weights, while research shows that fine-tuning can strip safety training with as few as 200 examples.
Related Pages
OpenAI
Leading AI lab that developed the GPT models and ChatGPT; analysis of its organizational evolution from non-profit research to commercial AGI development.
EU AI Act
The world's first comprehensive AI regulation, adopting a risk-based approach to regulate foundation models and general-purpose AI systems
AI Proliferation
AI proliferation, the spread of capabilities from frontier labs to diverse actors, which accelerated dramatically as the capability gap narrowed.
AI Development Racing Dynamics
Competitive pressure driving AI development faster than safety work can keep up, creating prisoner's dilemma situations in which actors cut safety corners.
Refusal Training
Refusal training teaches AI models to decline harmful requests rather than comply.