Longterm Wiki

Anthropic's sabotage evaluations

web

Credibility Rating

4/5 (High)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Anthropic

Data Status

Not fetched

Cited by 1 page

Page | Type | Quality
AI Capability Sandbagging | Risk | 67.0
Resource ID: 9d653677d03c2df3 | Stable ID: MDQ0MDEwOT