Summary
An interactive, sortable table summarizing which AI safety approaches are likely to generalize to future architectures. For each approach, it shows the expected generalization level, dependencies, and threats.
Change History
Remove legacy pageTemplate frontmatter (3 weeks ago)
Removed the legacy `pageTemplate` frontmatter field from 15 MDX files. This field was carried over from the Astro/Starlight era and is not used by the Next.js application.
opus-4-6 · ~10min
Safety Generalizability Table
Columns: Expected generalization to future AI architectures | Requires (to work) | Threatened by
Mechanistic Interpretability
Circuit-level understanding of model internals. High value if it works, but highly dependent on architecture stability and access.
Circuits, probing, activation patching
Expected generalization: LOW
Requires (to work):
✓ White-box access available?
✓ Representations converge?
✓ Architecture stable enough?
Threatened by:
✗ Heavy scaffolding?
✗ Novel architecture emerges?
Training-Based Alignment
Shaping model behavior through training signals (RLHF, Constitutional AI, debate). Requires training access but is somewhat architecture-agnostic.