
Interpretability

Team · Active · Anthropic · January 2021 – present

The Interpretability team's mission is to discover and understand how large language models work internally, as a foundation for AI safety and positive outcomes. The team is led by Chris Olah.
