Division: Interpretability
All three key fields in the record are confirmed by the source: (1) name 'Interpretability' is explicitly listed as a research team; (2) type 'team' is confirmed by the source referring to it as 'the Interpretability team'; (3) status 'active' is confirmed by recent publications attributed to this team dated in 2026, demonstrating current activity. The source subject matches the claim subject exactly.
Our claim
entire record
- Parent Org: Anthropic
- Name: Interpretability
- Division Type: team
- Status: active
- Start Date: January 2021
- Notes: The Interpretability team's mission is to discover and understand how large language models work internally, as a foundation for AI safety and positive outcomes. Led by Chris Olah.
Source evidence
2 src · 3 checks
Note (QUA-650 retro-scan): The source is a research paper by Anthropic's Interpretability team, not a document about the division or organizational unit itself. Per QUA-648, a product or output of an organization (here, a research paper) is a MISMATCH with the organizational unit that produced it. The claim is about 'Interpretability' as a division/organizational entity, while the source is about research conducted by that team.
Note: The source text does not explicitly mention or reference a division, team, or organizational unit called 'Mechanistic Interpretability' with the specified properties (name, type, status). While the paper is clearly about mechanistic interpretability research and is authored by members of 'the Anthropic interpretability team' (mentioned in the text: 'scaling sparse autoencoders has been a major priority of the Anthropic interpretability team'), the source does not provide structured information confirming that 'Mechanistic Interpretability' is an active division/team with those exact specifications. The paper discusses the research area and mentions the team informally, but does not present organizational metadata about a division named 'Mechanistic Interpretability' with type 'team' and status 'active'.
Note: All three key fields in the record are confirmed by the source: (1) name 'Interpretability' is explicitly listed as a research team; (2) type 'team' is confirmed by the source referring to it as 'the Interpretability team'; (3) status 'active' is confirmed by recent publications attributed to this team dated in 2026, demonstrating current activity. The source subject matches the claim subject exactly.