Nonprofit designing and testing democratic processes for AI governance and alignment. Led by Aviv Ovadya. Key projects include the Democracy Levels framework (evaluating how democratically decisions are made), Case Law for AI (a judicial-inspired approach to channeling deliberative input into alignment), and Safeguarded AI (an ARIA partnership for developing formal safety specifications through deliberation). 11 staff members.
AI Alignment (Approach; quality 91/100): Comprehensive review of AI alignment approaches finding that current methods (RLHF, Constitutional AI) show 75%+ effectiveness on measurable safety metrics for existing systems but face critical scalability...
Analysis
Alignment Robustness Trajectory Model (Analysis; quality 64/100): This model estimates that alignment robustness degrades from 50-65% at GPT-4 level to 15-30% at 100x capability, with a critical 'alignment valley' at 10-30x where systems are dangerous but can't help s...
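As a rough illustration of the trajectory the card describes, the sketch below interpolates robustness log-linearly between the two stated anchor points. The midpoint values and the interpolation scheme are illustrative assumptions, not part of the model itself.

```python
import math

# Anchor points taken from the card's summary, using the midpoints of the
# stated ranges; the log-linear interpolation between them is an assumption.
ANCHORS = [
    (1.0, 0.575),    # GPT-4 level: 50-65% robustness -> midpoint 57.5%
    (100.0, 0.225),  # 100x capability: 15-30% robustness -> midpoint 22.5%
]

def robustness(capability_multiple: float) -> float:
    """Interpolate robustness log-linearly between the two anchors."""
    (x0, y0), (x1, y1) = ANCHORS
    t = (math.log10(capability_multiple) - math.log10(x0)) / (
        math.log10(x1) - math.log10(x0)
    )
    t = min(max(t, 0.0), 1.0)  # clamp outside the anchored range
    return y0 + t * (y1 - y0)

if __name__ == "__main__":
    for mult in (1, 10, 30, 100):
        note = "  <- 'alignment valley' (10-30x)" if 10 <= mult <= 30 else ""
        print(f"{mult:>3}x capability: ~{robustness(mult):.0%} robustness{note}")
```

Under these assumptions the 10-30x "valley" lands at roughly 32-40% robustness, the zone the model flags as systems being capable enough to be dangerous while robustness has already degraded substantially.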
Other
Iason Gabriel (Person): Research Scientist (Ethics) at Google DeepMind. Author of "Artificial Intelligence, Values, and Alignment" (Minds and Machines, 2020). Contributed to the Democratic AI experiment (Koster, Balaguer ...
Divya Siddarth (Person): Co-founder and Executive Director of the Collective Intelligence Project (CIP). Leads development of Alignment Assemblies and Global Dialogues, bringing thousands of voices worldwide into AI governance...
Value Learning (Research Area; quality 59/100): Training AI systems to infer and adopt human values from observation and interaction.
RLHF (Research Area; quality 63/100): RLHF/Constitutional AI achieves 82-85% preference improvements and 40.8% adversarial attack reduction for current systems, but faces fundamental scalability limits: weak-to-strong supervision shows...
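To make the "weak-to-strong supervision" setup mentioned in the RLHF entry concrete, here is a toy sketch of the general experimental design: a strong student is trained on a weak supervisor's noisy labels and scored by how much of the weak-to-ceiling performance gap it recovers. The data, noise rate, and models are synthetic assumptions, not a reproduction of any cited result.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy weak-to-strong supervision sketch (illustrative assumptions throughout):
# a strong student is trained on a weak supervisor's noisy labels, and we
# measure how much of the gap between the weak supervisor and a
# ground-truth-trained ceiling the student recovers.
rng = np.random.default_rng(0)
n, d = 4000, 20
X = rng.normal(size=(n, d))
y = (X @ rng.normal(size=d) > 0).astype(int)  # ground-truth labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Weak supervisor: ground truth with 25% of training labels randomly flipped.
flip = rng.random(len(y_tr)) < 0.25
weak_labels = np.where(flip, 1 - y_tr, y_tr)
weak_acc = (weak_labels == y_tr).mean()  # supervisor accuracy, ~75%

# Strong student trained on weak labels vs. a ceiling trained on ground truth.
student = LogisticRegression().fit(X_tr, weak_labels)
ceiling = LogisticRegression().fit(X_tr, y_tr)

student_acc = (student.predict(X_te) == y_te).mean()
ceiling_acc = (ceiling.predict(X_te) == y_te).mean()
pgr = (student_acc - weak_acc) / (ceiling_acc - weak_acc)  # gap recovered
print(f"weak {weak_acc:.2f} | student {student_acc:.2f} | "
      f"ceiling {ceiling_acc:.2f} | gap recovered {pgr:.0%}")
```

Because the label noise here is random rather than systematic, the student recovers most of the gap; the scalability concern in the entry is about harder settings where the supervisor's errors are correlated with exactly the cases that matter.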
Key Debates
Why Alignment Might Be Hard (Argument; quality 69/100): A comprehensive taxonomy of alignment difficulty arguments spanning specification problems, inner alignment failures, verification limits, and adversarial dynamics, with expert p(doom) estimates ranging...
AI Alignment Research Agendas (Crux; quality 69/100): Comprehensive comparison of major AI safety research agendas ($100M+ Anthropic, $50M+ DeepMind, $5-10M nonprofits) with detailed funding, team sizes, and failure mode coverage (25-65% per agenda). ...
Risks
Epistemic Sycophancy (Risk; quality 60/100): AI sycophancy, where models agree with users rather than provide accurate information, affects all five state-of-the-art models tested, with medical AI showing 100% compliance with illogical requests...