Longterm Wiki
Updated 2026-03-13
Summary

Technical architecture document analyzing the wiki's overlapping citation systems (remark-gfm footnotes, References component, CitationOverlay, ResourceLink, and KB fact references) and proposing a unified architecture. Includes data model, rendering design, implementation phases, and exemplar page analysis. The plan centers on keeping standard Markdown footnotes as the authoring format while auto-creating resource YAML entries and rendering a single unified bibliography. Includes a status section tracking what has been implemented (KB citations, reference preprocessor, migrate-cr tooling) versus what remains proposed.


Citation Architecture: Current State & Unified Proposal

This document analyzes the current citation/reference system in the Longterm Wiki and proposes a concrete plan to unify it. The goal: every wiki page has a single, beautiful bibliography with rich metadata, verification indicators, and inline hover cards, all built on standard Markdown footnotes with zero migration cost.

Tracking issues:


The Problem: Five Overlapping Systems

The wiki has five overlapping citation/reference mechanisms, the newest being KB fact references:

1. remark-gfm Footnotes (MDX authoring format)

Standard Markdown footnotes used in all ~625 pages:

Kalshi is the first federally regulated prediction market.[^4]

remark-gfm compiles these into:

  • Inline: <sup><a data-footnote-ref href="#user-content-fn-1">[1]</a></sup>
  • Bottom section: <section data-footnotes><ol><li id="user-content-fn-1">...</li></ol></section>

Coverage: Complete (every page has footnotes). Metadata: None (just a title + URL link).

2. References Component (resource YAML rendering)

Server component (apps/web/src/components/wiki/References.tsx) that renders from resource YAML entries in data/resources/*.yaml:

<References pageId="kalshi" />

Shows: title, author, date, credibility badge, publication name, peer-review status, expandable details with summary and verification dots.

Coverage: Partial (only sources that have hand-written YAML entries). On the Kalshi page, 31 resource entries exist against 87 footnotes, which cite roughly 35 unique URLs. Metadata: Rich (title, author, date, type, credibility, summary, tags, publication).

3. CitationOverlay (verification indicators)

Client component (apps/web/src/components/wiki/CitationOverlay.tsx) that uses DOM portals to inject colored dots onto footnote [N] refs:

  • Finds all <a data-footnote-ref> elements in the article
  • Looks up verification data from Postgres citation_quotes table
  • Renders hover cards with accuracy verdicts, supporting quotes, confidence scores

Coverage: Only citations that have been accuracy-checked. Metadata: Verification-only (no source metadata like author/date).

4. ResourceLink (<R> inline component)

Component (apps/web/src/components/wiki/ResourceLink.tsx) designed as an alternative to footnotes:

<R id="resource-id" n={1}>Link text</R>

Shows a tooltip with resource metadata + credibility badge. Imported on the Kalshi page but never actually used. No page in the wiki uses this component in practice.

5. KB Fact References (structured data citations)

The newest citation type, added in early 2026. KB facts (packages/kb/data/things/*.yaml) are the canonical source for structured data (employee counts, funding amounts, founding dates, etc.). Pages can cite them via special footnote markers:

Anthropic has approximately 1,500 employees.[^kb-{factId}]

The reference preprocessor (apps/web/src/lib/reference-preprocessor.ts) handles three marker types at build time:

  • [^cr-] markers are claim references (legacy, being migrated away via pnpm crux footnotes migrate-cr)
  • [^rc-] markers are citation references (standard DB-backed citations)
  • [^kb-{factId}] markers are KB fact references (links to structured facts with source URL, sourceResource, sourceQuote, and asOf date)

The preprocessor replaces these markers with numbered [^N] footnotes and appends auto-generated definitions, so remark-gfm processes them like any other footnote. KB fact footnotes include the fact value, date, and source link.

Coverage: Growing (used on pages that display KB-sourced data). Metadata: Rich (value, unit, date, source URL, source quote, notes).
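
The marker rewriting described above can be sketched roughly as follows. This is a minimal illustration, not the contents of reference-preprocessor.ts: the function name preprocessReferences, the Resolver callback, and the numbering strategy (continuing after the highest existing numeric footnote) are all assumptions made for the example.

```typescript
// Hypothetical sketch; the real reference-preprocessor.ts may work differently.
type MarkerKind = "cr" | "rc" | "kb";

interface Marker {
  kind: MarkerKind;
  id: string;
}

// Caller-supplied lookup that returns the footnote definition text for a marker
// (for example, a KB fact's value, date, and source link).
type Resolver = (marker: Marker) => string;

const MARKER_RE = /\[\^(cr|rc|kb)-([A-Za-z0-9_-]+)\]/g;

function preprocessReferences(mdx: string, resolve: Resolver): string {
  // Assumption: new numbers continue after the highest existing [^N] footnote.
  let next = 1;
  for (const m of mdx.matchAll(/\[\^(\d+)\]/g)) {
    next = Math.max(next, Number(m[1]) + 1);
  }

  const assigned = new Map<string, number>();
  const definitions: string[] = [];

  const body = mdx.replace(MARKER_RE, (_match, kind: MarkerKind, id: string) => {
    const key = `${kind}-${id}`;
    let n = assigned.get(key);
    if (n === undefined) {
      // First occurrence: allocate a number and generate one definition.
      n = next++;
      assigned.set(key, n);
      definitions.push(`[^${n}]: ${resolve({ kind, id })}`);
    }
    return `[^${n}]`;
  });

  // Append the auto-generated definitions so remark-gfm sees ordinary footnotes.
  return definitions.length > 0 ? `${body}\n\n${definitions.join("\n")}\n` : body;
}
```

In this sketch, repeated uses of the same marker share one footnote number, so a fact cited twice yields a single definition.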

The Result

Readers see:

  1. A plain footnote section at the bottom (remark-gfm) with just title + URL links
  2. A separate "References" section below that (References component) with different numbering and richer metadata, but only for some sources
  3. Colored verification dots injected onto inline [N] refs (CitationOverlay) with no connection to either bibliography
  4. KB fact footnotes generated by the reference preprocessor, which appear as regular numbered footnotes but source their data from the KB YAML layer

Case Study: Kalshi Page

The Kalshi page is a good exemplar because it exercises the full system:

| Metric | Value |
| --- | --- |
| Total footnotes | 87 |
| Unique source URLs | ≈35 |
| Resource YAML entries | 31 |
| Citation quotes (Postgres) | 87 |
| Verified accurate | 84 (96.6%) |
| Broken citations | 3 (HTTP 403) |
| Quality score | 25/100 |
| Hallucination risk | Medium (50/100) |

Key observations:

  • 87 footnotes but only ~35 unique URLs (many footnotes cite the same source)
  • 31 resource YAML entries, meaning ~4 URLs have no resource entry at all
  • The remark-gfm footnote section lists all 87 entries (with duplicates)
  • The References section lists the 31 matched resources (different ordering)
  • Citation quotes exist for all 87 footnotes but aren't connected to the resource metadata

The gap: The remark-gfm section shows everything but with no metadata. The References section shows rich metadata but only for a subset. The verification dots are on a third layer. Nothing is unified.


Proposed Architecture: Unified Citations

Core Principle

Keep [^N] footnotes as the authoring format. They're standard Markdown, LLMs generate them naturally, every page already uses them, and they're portable. The magic happens in the build/render pipeline that enriches them with resource metadata and verification data.

Data Model

[Data model diagram not rendered]

Key Design Decisions

| Decision | Choice | Reasoning |
| --- | --- | --- |
| Citation format | Keep [^N] | LLMs generate naturally, standard Markdown, zero migration |
| Suppress gfm footnotes | CSS display: none | Simple, reversible, no remark plugin complexity |
| Bibliography grouping | By unique source | 87 footnotes → 35 entries, much more readable |
| Resource creation | Auto from URLs | Manual doesn't scale; 31/87 gap proves this |
| Rendering split | Server + client hybrid | Server renders bibliography; client adds hover cards |

New Pipeline Step: Resource Auto-Registration

New crux command: pnpm crux citations register-resources <pageId>

  1. Parse all [^N]: Title: URL definitions from MDX

  2. Extract unique URLs (87 footnotes → ~35 unique URLs for Kalshi)

  3. For each URL without a resource YAML entry:

    • Use citation_content cache or fetch the URL
    • Extract: title, domain, type, published_date, authors
    • Generate resource ID (SHA-256 hash of URL, first 16 hex chars)
    • Create YAML entry in the appropriate data/resources/ file
  4. Report: "35 unique URLs, 31 already registered, 4 newly created"
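
The steps above can be sketched as a small planning function. This is a hedged illustration only: the register-resources command is still proposed, and the names resourceIdForUrl, extractFootnoteUrls, and planRegistration are hypothetical. The sketch also assumes that each footnote definition sits on one line and contains a single http(s) URL, and it skips the fetch and metadata-extraction step entirely.

```typescript
import { createHash } from "node:crypto";

// Resource ID as described in step 3: SHA-256 of the URL, first 16 hex chars.
function resourceIdForUrl(url: string): string {
  return createHash("sha256").update(url).digest("hex").slice(0, 16);
}

// Assumption: a definition is a line starting with "[^N]:" that contains one
// http(s) URL somewhere after the marker (bare or inside a Markdown link).
const DEFINITION_RE = /^\[\^(\d+)\]:.*?(https?:\/\/[^\s)>\]]+)/gm;

function extractFootnoteUrls(mdx: string): Map<number, string> {
  const urls = new Map<number, string>();
  for (const m of mdx.matchAll(DEFINITION_RE)) {
    urls.set(Number(m[1]), m[2]);
  }
  return urls;
}

interface RegistrationPlan {
  uniqueUrls: number;
  alreadyRegistered: number;
  toCreate: { id: string; url: string }[];
}

// registeredUrls stands in for the URLs already present in data/resources/*.yaml.
function planRegistration(mdx: string, registeredUrls: Set<string>): RegistrationPlan {
  const unique = new Set(extractFootnoteUrls(mdx).values());
  const toCreate = Array.from(unique)
    .filter((url) => !registeredUrls.has(url))
    .map((url) => ({ id: resourceIdForUrl(url), url }));
  return {
    uniqueUrls: unique.size,
    alreadyRegistered: unique.size - toCreate.length,
    toCreate,
  };
}
```

The sketch compares URLs as exact strings. A real command would probably need to normalize them first (trailing slashes, tracking parameters, trailing punctuation), otherwise the same source could be registered under two IDs.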

Build Step: Footnote Index (not yet implemented)

The original proposal called for build-data.mjs to gain a footnoteIndex mapping every footnote number to its resource. This has not been built. Instead, the reference preprocessor (see Status section below) handles the mapping at MDX compile time for DB-driven references (cr-, rc-, kb-). Standard [^N] footnotes are still processed only by remark-gfm with no resource linkage.

A future footnoteIndex build step would enable richer rendering for standard footnotes (not just DB-driven ones):

{
  "footnoteIndex": {
    "kalshi": {
      "1": { "resourceId": "abc123", "url": "https://kalshi.com/about", "title": "About Kalshi" },
      "2": { "resourceId": "def456", "url": "https://research.contrary.com/...", "title": "Contrary Research" }
    }
  }
}
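
A possible shape for that index, and for the function that would build one page's slice of it, is sketched below. Since the build step does not exist yet, every name here (FootnoteIndexEntry, ParsedFootnote, ResourceRecord, buildPageFootnoteIndex) is hypothetical; only the three entry fields come from the JSON example above.

```typescript
// Entry fields mirror the JSON example: resourceId, url, title.
interface FootnoteIndexEntry {
  resourceId: string;
  url: string;
  title: string;
}

// pageId -> footnote number (as a string key) -> entry
type FootnoteIndex = Record<string, Record<string, FootnoteIndexEntry>>;

// Hypothetical inputs: footnotes parsed from MDX and resources loaded from YAML.
interface ParsedFootnote {
  n: number;
  url: string;
}

interface ResourceRecord {
  id: string;
  url: string;
  title: string;
}

function buildPageFootnoteIndex(
  footnotes: ParsedFootnote[],
  resources: ResourceRecord[]
): Record<string, FootnoteIndexEntry> {
  const byUrl = new Map<string, ResourceRecord>();
  for (const r of resources) byUrl.set(r.url, r);

  const pageIndex: Record<string, FootnoteIndexEntry> = {};
  for (const fn of footnotes) {
    const resource = byUrl.get(fn.url);
    // Footnotes without a registered resource are left out of the index.
    if (resource) {
      pageIndex[String(fn.n)] = {
        resourceId: resource.id,
        url: resource.url,
        title: resource.title,
      };
    }
  }
  return pageIndex;
}
```

Leaving unmatched footnotes out of the index makes gaps in resource coverage easy to detect: any footnote number missing from a page's index still needs a resource entry.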

Proposed Component: UnifiedReferences (not yet implemented)

Would replace both the remark-gfm footnote section and the current References component:

  1. Groups footnotes by unique source (dedup): footnotes [17]-[21] all citing Sigma World → 1 entry

  2. Each source entry shows:

    • Title (linked to URL)
    • Metadata: publication/domain, author, year, type
    • Credibility badge
    • Verification dot (aggregate across all claims citing this source)
    • Back-refs: "Referenced by [5] [6] [15] [16] [17] [18] [19] [20] [21]"
  3. Expandable details:

    • Source summary
    • Per-claim verification table (from citation_quotes)
    • Supporting quotes from source
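
The grouping and back-reference logic could look roughly like this. It is a sketch under stated assumptions, not the component itself: the verdict labels, the FootnoteRef and SourceEntry shapes, and the "worst verdict wins" aggregation rule are all hypothetical, since the document does not specify how the aggregate verification dot is computed.

```typescript
// Hypothetical verdict labels; the real citation_quotes schema may differ.
type Verdict = "accurate" | "unverified" | "inaccurate";

interface FootnoteRef {
  n: number;
  url: string;
  title: string;
  verdict: Verdict;
}

interface SourceEntry {
  url: string;
  title: string;
  backRefs: number[]; // footnote numbers citing this source
  verdict: Verdict;   // aggregate across all citing claims
}

// Assumed aggregation rule: the most severe verdict among the claims wins.
const SEVERITY: Record<Verdict, number> = { accurate: 0, unverified: 1, inaccurate: 2 };

function groupBySource(footnotes: FootnoteRef[]): SourceEntry[] {
  const bySource = new Map<string, SourceEntry>();
  for (const fn of footnotes) {
    const entry = bySource.get(fn.url);
    if (!entry) {
      bySource.set(fn.url, {
        url: fn.url,
        title: fn.title,
        backRefs: [fn.n],
        verdict: fn.verdict,
      });
      continue;
    }
    entry.backRefs.push(fn.n);
    if (SEVERITY[fn.verdict] > SEVERITY[entry.verdict]) {
      entry.verdict = fn.verdict;
    }
  }
  // Map preserves insertion order, so sources appear in order of first citation.
  return Array.from(bySource.values());
}
```

With this shape, the "Referenced by" line is a direct rendering of backRefs, and the per-claim verification table can be built from the same footnote list without a second lookup.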

Proposed Component: InlineCitationCards (not yet implemented)

Would replace CitationOverlay with richer hover cards on [N] refs:

  • Source title + domain + credibility
  • Verification verdict + confidence score
  • Supporting quote from source
  • "View in References" link
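
One way to assemble the data for such a card is sketched below. The href pattern comes from the remark-gfm output shown earlier (#user-content-fn-N); everything else is an assumption for illustration, including the ResourceMeta and Verification shapes, the function names, and the #reference-N anchor used for the "View in References" link.

```typescript
// Hypothetical shapes for the two data sources a card would merge.
interface ResourceMeta {
  title: string;
  domain: string;
  credibility?: string;
}

interface Verification {
  verdict: string;
  confidence: number;
  quote?: string;
}

interface CitationCard {
  footnote: number;
  resource?: ResourceMeta;      // absent if the footnote has no resource entry
  verification?: Verification;  // absent if the citation was never checked
  referencesHref: string;       // assumed anchor format for "View in References"
}

// remark-gfm footnote refs link to "#user-content-fn-N".
function footnoteNumberFromHref(href: string): number | null {
  const match = /#user-content-fn-(\d+)$/.exec(href);
  return match ? Number(match[1]) : null;
}

function buildCitationCard(
  href: string,
  resources: Map<number, ResourceMeta>,
  verifications: Map<number, Verification>
): CitationCard | null {
  const n = footnoteNumberFromHref(href);
  if (n === null) return null;
  return {
    footnote: n,
    resource: resources.get(n),
    verification: verifications.get(n),
    referencesHref: `#reference-${n}`,
  };
}
```

Keeping both halves optional lets a card degrade gracefully: a footnote with verification data but no resource entry still shows its verdict, and the reverse also holds.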

Files Changed

| Status | File | Purpose |
| --- | --- | --- |
| Done | apps/web/src/lib/reference-preprocessor.ts | Preprocesses [^cr-], [^rc-], [^kb-] markers into numbered footnotes |
| Done | crux/commands/footnotes.ts | migrate-cr command: converts [^cr-] to [^kb-] where URLs match KB facts |
| Proposed | crux/citations/register-resources.ts | Auto-registration command |
| Proposed | apps/web/src/components/wiki/UnifiedReferences.tsx | Unified bibliography |
| Proposed | apps/web/src/components/wiki/InlineCitationCards.tsx | Enhanced hover cards |
| Proposed | apps/web/scripts/build-data.mjs | Add footnoteIndex computation |
| Proposed | apps/web/src/data/index.ts | Export footnoteIndex accessor |
| Modify | apps/web/src/app/wiki/[id]/page.tsx | Wire up new components |
| Modify | Global CSS | Suppress section[data-footnotes] |
| Deprecate | References.tsx | Replaced by UnifiedReferences |
| Deprecate | CitationOverlay.tsx | Replaced by InlineCitationCards |
| Evaluate | ResourceLink.tsx (<R>) | No pages use it; likely remove |

Implementation Plan: Three Phases

Phase 1: Get 3 Pages Very Right

Exemplar pages: Kalshi, Anthropic, existential-risk

Scope:

  1. Resource auto-registration: Build register-resources command. Run on all 3 pages. Every footnoted URL gets a resource YAML entry with metadata.
  2. Footnote index: Add footnoteIndex computation to build-data.mjs. Every footnote maps to a resource.
  3. UnifiedReferences: Build the new component. Groups by unique source, shows rich metadata, verification dots, back-refs.
  4. InlineCitationCards: Build enhanced hover cards merging resource metadata + verification data.
  5. Suppress gfm footnotes: CSS hides the duplicate remark-gfm section.
  6. Citation quality: Run full verification pipeline on all 3 pages. Fix broken citations. Review accuracy flags. Content polish.
  7. Visual polish: Dark mode, mobile tap-to-show, accessibility.

Exit criteria: All 3 exemplar pages have:

  • 100% resource coverage (every footnote → resource)
  • Single unified bibliography (no duplicate sections)
  • Rich hover cards on all inline [N] refs
  • Full citation verification with accuracy verdicts
  • 0 broken citations

Phase 2: Large Migration

Scope:

  1. Batch auto-registration: Run register-resources --all across all ~625 pages
  2. Resolve conflicts: Handle edge cases (URLs that 404, pages with unusual footnote formats)
  3. Rebuild all: Rebuild database.json with footnoteIndex for every page
  4. Verify rendering: Spot-check 20-30 pages across entity types
  5. Page quality tiers (optional): Add tier: showcase | standard | draft frontmatter. Pages below quality threshold auto-hidden from sidebar.

Exit criteria: Every page renders with UnifiedReferences. No page shows the old dual-bibliography layout.

Phase 3: Cleanup DB + Server + Code

Scope:

  1. Remove deprecated components: Delete References.tsx, CitationOverlay.tsx, ResourceLink.tsx
  2. Clean up imports: Remove <R> imports from all MDX files that import but don't use it
  3. DB cleanup: Ensure citation_quotes have resource_id for all entries. Clean stale citation_content entries.
  4. Server sync: Sync all new resources to wiki-server Postgres
  5. Code cleanup: Remove old pageResources computation path (subsumed by footnoteIndex)
  6. Documentation: Update architecture.mdx, CLAUDE.md references

Exit criteria: No dead code. Single code path for citations. DB consistent. Server synced.


Relationship to KB as the Structured Data Layer

The Knowledge Base (KB) has become the canonical structured data layer for the wiki. Rather than a separate "claim-first" architecture, the KB serves as the authoritative source for quantitative and factual data, with citations flowing through the [^kb-] mechanism.

| What exists today | What it enables |
| --- | --- |
| KB facts YAML (packages/kb/data/things/) | Single source of truth for structured data (counts, dates, funding, etc.) |
| [^kb-] footnotes | Pages cite KB facts directly; preprocessor generates sourced footnotes |
| migrate-cr tooling | Migrates legacy [^cr-] claim references to [^kb-] where the source URL matches a KB fact |
| Reference preprocessor | Unified handling of all DB-driven citation types at build time |

| What this proposal adds | What it enables for KB |
| --- | --- |
| Resource auto-registration | Every KB fact source gets a canonical resource entry with rich metadata |
| Unified bibliography | KB-sourced footnotes render with the same rich metadata as hand-curated resources |
| Full verification on exemplars | Verification data connects to KB facts, not just raw footnotes |

The KB sits between raw sources and wiki pages: facts are extracted from sources (with provenance), stored in YAML, and cited by pages via [^kb-] markers. Building the resource registry and unified rendering makes this pipeline visible to readers.


Status (as of March 2026)

Implemented

| Component | Location | Notes |
| --- | --- | --- |
| Reference preprocessor | apps/web/src/lib/reference-preprocessor.ts | Handles [^cr-], [^rc-], and [^kb-] markers at MDX compile time. Replaces markers with numbered footnotes and appends auto-generated definitions. |
| KB fact references | [^kb-{factId}] syntax in MDX | Pages can cite KB facts directly. The preprocessor pulls value, source URL, date, and notes from KB YAML and generates a footnote definition. |
| migrate-cr command | pnpm crux footnotes migrate-cr | Converts legacy [^cr-] claim references to [^kb-] where the claim's source URL matches a KB fact's source. |
| CitationOverlay | apps/web/src/components/wiki/CitationOverlay.tsx | Injects verification dots onto footnote refs (still active, not yet replaced). |
| References component | apps/web/src/components/wiki/References.tsx | Renders resource YAML bibliography (still active, not yet replaced). |

Not Yet Implemented (Proposed)

| Component | Description |
| --- | --- |
| Resource auto-registration (register-resources) | Auto-create resource YAML entries for footnote URLs that lack them. |
| Footnote index (build step in build-data.mjs) | Map every footnote number to its resource for the rendering layer. |
| UnifiedReferences | Single bibliography component replacing both remark-gfm footnotes and References. |
| InlineCitationCards | Enhanced hover cards replacing CitationOverlay with resource metadata + verification. |
| CSS footnote suppression | Hide the remark-gfm section[data-footnotes] once UnifiedReferences renders the bibliography. |

Migration Progress

  • [^cr-] claim references: Legacy system. migrate-cr tooling exists to convert these to [^kb-] where source URLs match. Full migration is ongoing.
  • [^rc-] citation references: Active and stable. Used for DB-backed citations that don't map to KB facts.
  • [^kb-] KB fact references: The newest and preferred mechanism for structured data citations. Adoption is growing as KB fact coverage expands.
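
The URL-matching rule behind the [^cr-] to [^kb-] conversion can be illustrated with a short sketch. This is not the migrate-cr implementation: the Claim and KbFact shapes and the migrateClaimRefs function are hypothetical, and the sketch simply takes the first KB fact that shares the claim's source URL, which a real tool would likely need to refine when one source backs several facts.

```typescript
// Hypothetical shapes; real claim and KB fact records carry more fields.
interface Claim {
  id: string;
  sourceUrl: string;
}

interface KbFact {
  factId: string;
  sourceUrl: string;
}

interface MigrationResult {
  mdx: string;
  migrated: string[]; // claim IDs rewritten to [^kb-...]
  skipped: string[];  // claim IDs left as [^cr-...]
}

function migrateClaimRefs(mdx: string, claims: Claim[], facts: KbFact[]): MigrationResult {
  const claimUrl = new Map<string, string>();
  for (const c of claims) claimUrl.set(c.id, c.sourceUrl);

  // Simplification: if several facts share a source URL, the first one wins.
  const factByUrl = new Map<string, string>();
  for (const f of facts) {
    if (!factByUrl.has(f.sourceUrl)) factByUrl.set(f.sourceUrl, f.factId);
  }

  const migrated: string[] = [];
  const skipped: string[] = [];
  const out = mdx.replace(/\[\^cr-([A-Za-z0-9_-]+)\]/g, (marker, id: string) => {
    const url = claimUrl.get(id);
    const factId = url === undefined ? undefined : factByUrl.get(url);
    if (factId === undefined) {
      skipped.push(id); // no matching KB fact: keep the legacy marker
      return marker;
    }
    migrated.push(id);
    return `[^kb-${factId}]`;
  });
  return { mdx: out, migrated, skipped };
}
```

Returning the skipped IDs alongside the rewritten text gives a direct measure of how many legacy references remain after each run.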

Open Questions

  1. Resource YAML organization: Should auto-created resources go in a separate file (auto-registered.yaml) or be sorted into existing category files?
  2. Page quality tiers: Automatic from quality score, or explicit frontmatter?
  3. Hiding low-quality pages: How aggressive should this be? Only from the sidebar, or also from search?
  4. Person page exemplar: If we add a 4th exemplar later, which person page? Candidates: Dario Amodei, Eliezer Yudkowsky, Stuart Russell, Geoffrey Hinton.
  5. KB fact coverage threshold: At what point should all structured data citations use [^kb-] instead of [^rc-]? Should the improve pipeline auto-prefer KB facts when available?
  6. migrate-cr completion: Should the remaining [^cr-] references be bulk-migrated, or converted opportunistically as pages are improved?