Epistemic Virtue Evals
A proposed suite of open benchmarks evaluating AI models on epistemic virtues: calibration, clarity, bias resistance, sycophancy avoidance, and manipulation detection. Includes the concept of 'pedantic mode' for maximally accurate AI outputs.
Related Pages
Design Sketches for Collective Epistemics
Forethought Foundation's five proposed technologies for improving collective epistemics: community notes for everything, rhetoric highlighting, rel...
Deceptive Alignment
Risk that AI systems appear aligned during training but pursue different goals when deployed, with expert probability estimates ranging from 5% to 90% and g...
Sycophancy
AI systems trained to seek user approval may systematically agree with users rather than providing accurate information—an observable failure mode ...
AI Content Provenance Tracing
A proposed epistemic infrastructure making knowledge provenance transparent and traversable—enabling anyone to see the chain of citations, original...
Epistemic Tools Approaches Overview
Categories and methodologies for improving collective epistemics, from prediction markets to deliberation platforms.