Adversarial Robustness

Concept: Testing and improving AI systems' resilience to adversarial inputs and attacks.
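One standard way to probe this resilience is the Fast Gradient Sign Method (FGSM), which nudges an input in the direction that most increases the model's loss. The sketch below is purely illustrative: the linear classifier, its weights, and the input values are all assumptions chosen for a self-contained example, not any real system's parameters.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability a linear classifier assigns to label 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm_perturb(x, w, b, y, eps):
    """Shift x by eps in the sign of the loss gradient w.r.t. the input.

    For binary cross-entropy on a linear model, d(loss)/dx = (p - y) * w.
    """
    p = predict(w, b, x)
    grad_x = [(p - y) * wi for wi in w]
    return [xi + eps * (1.0 if g > 0 else -1.0) for xi, g in zip(x, grad_x)]

# Toy classifier and a clean input with true label y = 1 (assumed values).
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1.0

x_adv = fgsm_perturb(x, w, b, y, eps=0.25)
p_clean = predict(w, b, x)    # confidence on the clean input
p_adv = predict(w, b, x_adv)  # confidence after the adversarial nudge
```

Under this toy setup the perturbed input lowers the model's confidence in the true label (`p_adv < p_clean`); adversarial robustness testing measures how small a perturbation suffices, and robustness training aims to keep that gap small.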
Models

- Alignment Robustness Trajectory Model (Analysis): This model estimates alignment robustness degrades from 50-65% at GPT-4 level to 15-30% at 100x capability, with a critical 'alignment valley' at 10-30x where systems are dangerous but can't help s...
- Safety-Capability Tradeoff Model (Analysis): Analyzes when AI safety measures conflict with capabilities, finding most interventions impose 5-15% capability cost but RLHF actually improves usability +10-30%. Under strong racing dynamics (60-7...
- AI Acceleration Tradeoff Model (Analysis): Quantitative framework for evaluating how changes to AI development speed affect existential risk and long-term value. Models the marginal impact of acceleration/deceleration on P(existential catas...
Key Debates
- Technical AI Safety Research (Crux): Technical AI safety research encompasses six major agendas (mechanistic interpretability, scalable oversight, AI control, evaluations, agent foundations, and robustness) with 500+ researchers and ...
Organizations
- RAND Corporation: Nonprofit global policy think tank. Active in AI policy, security studies, and technology assessment.
- Center for a New American Security: Bipartisan national security and defense policy think tank.
Concepts
- Capability Evaluations: Systematic assessment of AI systems' abilities, especially dangerous capabilities like deception, manipulation, or autonomous operation.
- AI Content Moderation: Filtering and managing AI-generated or AI-mediated content.