Citation·page:anthropic:fn36
Anthropic - Footnote 36
Verdict: partial · 90%
1 check · 4/3/2026
The source does not explicitly state that Anthropic described this as the 'first empirical example' of alignment faking without training. It only mentions that the phenomenon wasn't explicitly programmed into the models. The source does not contain the critics' argument that the behaviors themselves indicate unresolved alignment challenges.
Our claim
Entire record: No record data available.
Source evidence
1 src · 1 check · partial · 90% · Haiku 4.5 · 4/3/2026
Note: The source does not explicitly state that Anthropic described this as the 'first empirical example' of alignment faking without training. It only mentions that the phenomenon wasn't explicitly programmed into the models. The source does not contain the critics' argument that the behaviors themselves indicate unresolved alignment challenges.
Case № page:anthropic:fn36 · Filed 4/3/2026 · Confidence 90%