Prosaic Alignment (Safety Agenda)
Aligning AI systems using current deep learning techniques without fundamental new paradigms
Risks
Epistemic Sycophancy (Risk): AI sycophancy, where models agree with users rather than provide accurate information, affects all five state-of-the-art models tested, with medical AI showing 100% compliance with illogical requests...
Analysis
Alignment Robustness Trajectory Model (Analysis): Estimates that alignment robustness degrades from 50-65% at GPT-4 level to 15-30% at 100x capability, with a critical "alignment valley" at 10-30x where systems are dangerous but can't help s...
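To make the headline numbers above concrete, the sketch below joins the midpoints of the stated ranges with a simple log-capability interpolation. This is illustrative only, not the model's actual code: the anchor midpoints, interpolation scheme, and all names are assumptions made here, and a monotone interpolation cannot capture the non-monotone "alignment valley" at 10-30x.

```python
import numpy as np

# Illustrative only: midpoints of the summary's ranges (50-65% at 1x GPT-4,
# 15-30% at 100x) joined by log-linear interpolation. The model's actual
# functional form, and its 10-30x "alignment valley" dip, are not reproduced.
ANCHOR_LOG_CAPABILITY = np.log10([1.0, 100.0])   # capability as a multiple of GPT-4
ANCHOR_ROBUSTNESS_PCT = [57.5, 22.5]             # midpoints of 50-65% and 15-30%

def robustness_estimate(capability_multiple: float) -> float:
    """Interpolated alignment-robustness percentage at a given capability multiple."""
    return float(np.interp(np.log10(capability_multiple),
                           ANCHOR_LOG_CAPABILITY, ANCHOR_ROBUSTNESS_PCT))

for c in (1, 10, 30, 100):
    print(f"{c:>4}x capability -> ~{robustness_estimate(c):.0f}% robustness")
```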
Key Debates
AI Alignment Research Agendas (Crux): Comprehensive comparison of major AI safety research agendas ($100M+ Anthropic, $50M+ DeepMind, $5-10M nonprofits) with detailed funding, team sizes, and failure mode coverage (25-65% per agenda...

Why Alignment Might Be Easy (Argument): Synthesizes empirical evidence that alignment is tractable, citing 29-41% RLHF improvements, Constitutional AI reducing bias across 9 dimensions, millions of interpretable features from Claude 3, a...
Concepts
RLHF (Capability): RLHF/Constitutional AI achieves 82-85% preference improvements and 40.8% adversarial attack reduction for current systems, but faces fundamental scalability limits: weak-to-strong supervision shows...
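For readers new to this entry, the "preference improvements" above refer to optimizing a learned reward model. The standard pairwise objective from the RLHF literature (the general formulation, not code or numbers specific to this page) trains the reward model $r_\theta$ on human comparisons by minimizing

$$
\mathcal{L}(\theta) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log \sigma\!\left(r_\theta(x, y_w) - r_\theta(x, y_l)\right)\right],
$$

where $x$ is a prompt, $y_w$ and $y_l$ are the human-preferred and dispreferred responses, and $\sigma$ is the logistic function; the policy is then fine-tuned, typically with PPO plus a KL penalty to the base model, to score highly under $r_\theta$.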
Organizations
Safe Superintelligence Inc. (Organization): Safe Superintelligence Inc. represents a significant AI safety organization founded by key OpenAI alumni with $3B funding and a singular focus on developing safe superintelligence, though its actua...

Cambridge Boston Alignment Initiative (Organization): Regional AI alignment research and community organization based in the Cambridge/Boston area.

Alignment Research Engineer Accelerator (Organization): Training program that upskills software engineers to become alignment researchers.

Apart Research (Organization): AI safety research organization running hackathons, fellowships, and collaborative research projects.

Pivotal Research (Organization): Research organization working on AI safety and alignment research.
Safety Research
AI Value Learning (Safety Agenda): Training AI systems to infer and adopt human values from observation and interaction.
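As a concrete illustration of "inferring values from observation", here is a toy sketch, with an entirely invented setup that is not any particular agenda's method: a reward function is fit by maximum likelihood to choices drawn from a Boltzmann-rational chooser.

```python
import numpy as np

# Toy value-learning sketch (all setup here is invented for illustration):
# a simulated "human" picks among three options with probability
# softmax(true_reward); we recover an estimated reward by maximum
# likelihood, which is identifiable only up to an additive constant.
rng = np.random.default_rng(0)
true_reward = np.array([1.0, 0.0, -1.0])

def softmax(r):
    e = np.exp(r - r.max())
    return e / e.sum()

choices = rng.choice(3, size=500, p=softmax(true_reward))
counts = np.bincount(choices, minlength=3)

r_hat = np.zeros(3)
for _ in range(2000):
    grad = counts - counts.sum() * softmax(r_hat)  # gradient of the log-likelihood
    r_hat += 0.001 * grad

print("true reward (centered):     ", true_reward - true_reward.mean())
print("estimated reward (centered):", np.round(r_hat - r_hat.mean(), 2))
```

The same maximum-likelihood structure underlies inverse reinforcement learning, where the options are trajectories rather than single choices.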