More capable models scheme at higher rates
web
Credibility Rating
4/5
High (4): High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: Apollo Research
Data Status
Not fetched
Cited by 11 pages
| Page | Type | Quality |
|---|---|---|
| Large Language Models | Concept | 62.0 |
| Situational Awareness | Capability | 67.0 |
| Alignment Evaluations | Approach | 65.0 |
| Capability Elicitation | Approach | 91.0 |
| Dangerous Capability Evaluations | Approach | 64.0 |
| Third-Party Model Auditing | Approach | 64.0 |
| AI Safety Cases | Approach | 91.0 |
| Scheming & Deception Detection | Approach | 91.0 |
| Sleeper Agent Detection | Approach | 66.0 |
| AI Capability Sandbagging | Risk | 67.0 |
| Treacherous Turn | Risk | 67.0 |
Resource ID: 80c6d6eca17dc925 | Stable ID: NzhhNTVlZD