Tool-Use Restrictions
Tool-use restrictions limit which actions and APIs an AI system can invoke, directly constraining its potential for harm. This approach is especially important for agentic AI systems because it imposes hard limits on capability regardless of the model's intentions. The need is growing: METR evaluations show the length of agentic tasks models can complete doubling roughly every 7 months.
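One common way to enforce such restrictions is an allowlist that gates every tool call the agent makes. The sketch below is a minimal, hypothetical illustration (the names `ToolRegistry`, `read_file`, and `run_shell` are assumptions for this example, not any specific framework's API); real deployments would combine this with sandboxing and auditing.

```python
# Minimal sketch of a tool-use allowlist for an agent loop.
# All names here are illustrative assumptions, not a real framework's API.

class ToolNotPermitted(Exception):
    """Raised when the agent requests a tool outside its allowlist."""

class ToolRegistry:
    def __init__(self, allowed):
        self._tools = {}
        self._allowed = set(allowed)  # hard limit, fixed at construction time

    def register(self, name, fn):
        self._tools[name] = fn

    def call(self, name, *args, **kwargs):
        # Enforce the restriction regardless of what the model requests.
        if name not in self._allowed:
            raise ToolNotPermitted(f"tool {name!r} is not on the allowlist")
        return self._tools[name](*args, **kwargs)

# Example policy: the agent may read files but never run shell commands.
registry = ToolRegistry(allowed={"read_file"})
registry.register("read_file", lambda path: f"contents of {path}")
registry.register("run_shell", lambda cmd: "should never execute")

print(registry.call("read_file", "notes.txt"))  # permitted
try:
    registry.call("run_shell", "rm -rf /")      # blocked by the allowlist
except ToolNotPermitted as exc:
    print("blocked:", exc)
```

Because the allowlist is checked at the call boundary rather than in the prompt, the restriction holds even if the model is manipulated into requesting a disallowed tool.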
Related Pages
Sandboxing / Containment
Sandboxing limits AI system access to resources, networks, and capabilities as a defense-in-depth measure.
Anthropic
An AI safety company founded by former OpenAI researchers that develops frontier AI models while pursuing safety research, including the Claude mod...
Agentic AI
AI systems that autonomously take actions in the world to accomplish goals. Industry forecasts project 40% of enterprise applications will include ...
METR
Model Evaluation and Threat Research conducts dangerous capability evaluations for frontier AI models, testing for autonomous replication, cybersec...
OpenAI
Leading AI lab that developed GPT models and ChatGPT, analyzing organizational evolution from non-profit research to commercial AGI development ami...