Longterm Wiki
Updated 2026-03-16

Data Architecture: Three Bases and Naming Guide

This document is the canonical naming reference for the wiki's data architecture. It explains the three conceptual data layers ("Bases"), maps each PostgreSQL table to its Base, and clarifies common naming confusions.


The Three Bases

The wiki organizes data into three conceptual layers:

| Base | What it stores | Primary source of truth | Key access module |
| --- | --- | --- | --- |
| TableBase | Typed relational records: entities, resources, publications, experts, organizations | YAML files in data/entities/, data/resources/ | apps/web/src/data/tablebase.ts |
| FactBase | Structured triples with temporal data and provenance (facts about entities) | YAML files in packages/factbase/data/things/ | apps/web/src/data/factbase.ts |
| WikiBase | Long-form prose MDX articles (the wiki pages readers see) | MDX files in content/docs/ | Page interface in tablebase.ts |
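
To make the mapping from source files to Bases concrete, here is a minimal TypeScript sketch that classifies a repository path by its Base. The path prefixes come from the table above; the `Base` type, `SOURCE_PREFIXES`, and `baseForPath` are hypothetical names for illustration and are not part of the wiki's codebase.

```typescript
// Hypothetical helper, not an existing module: classify a repo path by Base.
type Base = "TableBase" | "FactBase" | "WikiBase";

// Prefixes taken from the "Primary source of truth" column above.
// Matching by simple string prefix is an assumption of this sketch.
const SOURCE_PREFIXES: ReadonlyArray<readonly [string, Base]> = [
  ["data/entities/", "TableBase"],
  ["data/resources/", "TableBase"],
  ["packages/factbase/data/things/", "FactBase"],
  ["content/docs/", "WikiBase"],
];

// Returns the Base whose source directory contains the path,
// or undefined when the path is not a source-of-truth file.
function baseForPath(path: string): Base | undefined {
  const hit = SOURCE_PREFIXES.find(([prefix]) => path.startsWith(prefix));
  return hit?.[1];
}
```

For example, `baseForPath("packages/factbase/data/things/anthropic.yaml")` would classify that file as FactBase, while a path outside the four source directories yields `undefined`.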
flowchart TB
  subgraph Sources["Source of Truth"]
      YAML["data/entities/*.yaml<br/>data/resources/*.yaml"]
      FB_YAML["packages/factbase/data/things/*.yaml"]
      MDX["content/docs/**/*.mdx"]
  end
  subgraph Build["Build Pipeline"]
      BD["build-data.mjs"]
  end
  subgraph Artifacts["Build Artifacts"]
      DB_JSON["database.json<br/>(TableBase + WikiBase)"]
      FB_JSON["factbase-data.json<br/>(FactBase)"]
  end
  subgraph PG["PostgreSQL (wiki-server)"]
      direction TB
      TB_PG["TableBase tables<br/>(entities, resources,<br/>entity_ids, summaries)"]
      FB_PG["FactBase mirror<br/>(facts table)"]
      WB_PG["WikiBase mirror<br/>(wiki_pages)"]
      UNI["Cross-Base index<br/>(things table)"]
      OPS["Operational tables<br/>(sessions, jobs,<br/>citation_quotes, etc.)"]
  end
  YAML --> BD
  FB_YAML --> BD
  MDX --> BD
  BD --> DB_JSON
  BD --> FB_JSON
  BD --> TB_PG
  BD --> FB_PG
  BD --> WB_PG
  BD --> UNI

PG Tables Grouped by Base

TableBase tables (entity catalog)

These tables mirror the YAML entity/resource catalog. YAML files remain authoritative; these PG tables are queryable read mirrors for the API.

| PG table | Drizzle export | Purpose |
| --- | --- | --- |
| entities | entities | Read mirror of data/entities/*.yaml. One row per entity (org, person, risk, etc.). |
| entity_ids | entityIds | Central ID registry. Maps numeric IDs (E42) to slugs. Sequence-allocated. |
| resources | resources | Read mirror of data/resources/*.yaml. Papers, blog posts, reports. |
| summaries | summaries | LLM-generated entity summaries. One per entity, keyed by entities.stable_id. |
| page_links | pageLinks | Directional knowledge graph between entities/pages. |
| resource_citations | resourceCitations | Many-to-many join: which resources are cited on which pages. |
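
The entity_ids registry maps numeric IDs written in the `E42` form to slugs. A minimal sketch of that lookup follows, assuming the display form is the letter `E` followed by decimal digits and using an in-memory `Map` as a stand-in for the PG table. The function names and the idea of passing the registry as a `Map` are hypothetical; the real registry is sequence-allocated in PostgreSQL.

```typescript
// Hypothetical helpers illustrating the "E42" <-> slug registry idea.

// Parse "E42" into 42. Returns null for anything not matching E<digits>.
function parseEntityId(text: string): number | null {
  const match = /^E(\d+)$/.exec(text);
  return match ? Number(match[1]) : null;
}

// Format 42 back into "E42".
function formatEntityId(n: number): string {
  return `E${n}`;
}

// Look up a slug by display ID in an in-memory stand-in for entity_ids.
function slugForEntityId(
  registry: ReadonlyMap<number, string>,
  displayId: string,
): string | undefined {
  const n = parseEntityId(displayId);
  return n === null ? undefined : registry.get(n);
}
```

With a registry such as `new Map([[42, "example-org"]])` (an invented row, not real data), `slugForEntityId(registry, "E42")` would resolve to that slug, and malformed input such as `"42"` resolves to `undefined`.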

FactBase tables (structured facts)

YAML files in packages/factbase/data/things/ are currently the primary source. They sync to PG via crux wiki-server sync-facts. The PG facts table provides an export endpoint (GET /api/facts/export) for consumers that need PG-backed access. Once the PG schema includes all Fact fields (validEnd, currency, etc.), PG will become the primary source.

| PG table | Drizzle export | Purpose |
| --- | --- | --- |
| facts | facts | PG mirror of FactBase YAML. Numeric/string facts with timeseries support via measure + as_of. Export via /api/facts/export. |
| properties | properties | Controlled vocabulary for fact property types (valuation, headcount, ceo, etc.). |
| factbase_resource_verifications | | Replaced by unified source_check_evidence table (migration 0127). |
| factbase_verdicts | | Replaced by unified source_check_verdicts table (migration 0127). |
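
The facts table supports timeseries through the measure + as_of pair. As an illustration only, the sketch below picks the most recent value of a measure for one entity, optionally as of a cutoff date. The `FactRow` shape, its field names, and the assumption that `asOf` values share one ISO-style format (so that string comparison orders them chronologically) are all hypothetical and not taken from the actual schema.

```typescript
// Hypothetical row shape; the real facts table has more columns.
interface FactRow {
  entityId: string;
  measure: string;        // e.g. "headcount"
  asOf: string;           // assumed ISO-style and same format throughout, e.g. "2024-06"
  value: number | string;
}

// Return the latest fact for (entityId, measure), ignoring rows dated
// after `onOrBefore` when a cutoff is given.
function latestFact(
  rows: readonly FactRow[],
  entityId: string,
  measure: string,
  onOrBefore?: string,
): FactRow | undefined {
  let best: FactRow | undefined;
  for (const row of rows) {
    if (row.entityId !== entityId || row.measure !== measure) continue;
    if (onOrBefore !== undefined && row.asOf > onOrBefore) continue;
    // Same-format ISO date strings sort chronologically as plain strings.
    if (best === undefined || row.asOf > best.asOf) best = row;
  }
  return best;
}
```

Calling `latestFact(rows, id, "headcount")` returns the newest observation, while passing a cutoff such as `"2023-12"` returns the value that was current at that point in time.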

WikiBase tables (prose content)

These tables mirror the MDX wiki pages.

| PG table | Drizzle export | Purpose |
| --- | --- | --- |
| wiki_pages | wikiPages | Mirror of ≈700 MDX pages. Full-text searchable. Dual-ID: text id (legacy) + integer_id (Phase 4a). |
| edit_logs | editLogs | Per-page edit history with tool/agency attribution. |
| page_improve_runs | pageImproveRuns | Records of AI-driven page improvement runs. |

Cross-Base index (the things table)

The things table is a cross-base universal index used for search. Every identifiable item in the system (entity, fact, grant, resource, personnel record, division, etc.) gets a single row. This enables:

  • Cross-domain search (search everything in one query)
  • A single browse UI for all data

| PG table | Drizzle export | Purpose |
| --- | --- | --- |
| things | things | Universal search index. thing_type indicates domain (entity, fact, grant, etc.). source_table + source_id point back to the originating record. |
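
Because each things row carries source_table + source_id, a consumer can follow a search hit back to its originating record. The sketch below shows one way such a dispatch could look. The `ThingRow` shape (camelCase mirrors of the column names), the `Loader` type, and `resolveThing` are hypothetical, and treating source_id as a string is an assumption.

```typescript
// Hypothetical mirror of the three columns described above.
interface ThingRow {
  thingType: string;   // thing_type, e.g. "entity", "fact", "grant"
  sourceTable: string; // source_table, e.g. "entities"
  sourceId: string;    // source_id (assumed to be a string here)
}

// A loader fetches one record from a specific source table.
type Loader = (id: string) => unknown;

// Follow a things row back to its originating record by dispatching
// on source_table. Throws if no loader is registered for that table.
function resolveThing(
  thing: ThingRow,
  loaders: ReadonlyMap<string, Loader>,
): unknown {
  const load = loaders.get(thing.sourceTable);
  if (load === undefined) {
    throw new Error(`No loader registered for table "${thing.sourceTable}"`);
  }
  return load(thing.sourceId);
}
```

Keeping the loaders in a map keyed by table name means a new record type only needs a new map entry, which fits an index that is meant to cover every domain.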

Unified Verification tables

All verification data lives in two tables (replacing the previous six). See discussion #2950.

| PG table | Drizzle export | Purpose |
| --- | --- | --- |
| source_check_evidence | sourceCheckEvidence | Per-source checks for any record type. Supports row-level (field_name = NULL) and cell-level (field_name = column name) source-checking. |
| source_check_verdicts | sourceCheckVerdicts | Aggregate verdicts per claim. Keyed by (record_type, record_id, COALESCE(field_name, '')). |
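
The verdict key uses COALESCE(field_name, '') so that row-level checks (field_name = NULL) still take part in a uniqueness key. The following sketch mirrors that normalization on the application side. `VerdictRef` and `verdictKey` are hypothetical names, and the string type for record_id is an assumption.

```typescript
// Hypothetical reference to one verdict: a whole row or a single cell.
interface VerdictRef {
  recordType: string;
  recordId: string;         // assumed to be a string in this sketch
  fieldName: string | null; // null = row-level, column name = cell-level
}

// Build a lookup key equivalent to
// (record_type, record_id, COALESCE(field_name, '')).
// JSON encoding keeps the three parts unambiguous even if they
// contain separator characters.
function verdictKey(ref: VerdictRef): string {
  return JSON.stringify([ref.recordType, ref.recordId, ref.fieldName ?? ""]);
}
```

One consequence of this keying is that a NULL field name and an empty-string field name map to the same key, so row-level verdicts occupy the `''` slot.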

Operational tables

These tables are not part of any Base. They track system operations, CI/CD, and agent activity.

| PG table | Drizzle export | Purpose |
| --- | --- | --- |
| citation_quotes | citationQuotes | Per-footnote citation verification data. |
| citation_content | citationContent | Cached fetched HTML/text from source URLs. |
| citation_accuracy_snapshots | citationAccuracySnapshots | Page-level citation health aggregations. |
| hallucination_risk_snapshots | hallucinationRiskSnapshots | Per-page hallucination risk scores. |
| page_citations | pageCitations | Non-claim footnote citations. |
| sessions | sessions | Legacy session log (being superseded by agent_sessions). |
| session_pages | sessionPages | Pages modified per session. |
| agent_sessions | agentSessions | Full agent session lifecycle. |
| agent_session_pages | agentSessionPages | Pages modified per agent session. |
| agent_session_events | agentSessionEvents | Agent audit trail. |
| active_agents | activeAgents | Live agent coordination with heartbeat. |
| auto_update_runs | autoUpdateRuns | Auto-update pipeline run history. |
| auto_update_results | autoUpdateResults | Per-page results from auto-update runs. |
| auto_update_news_items | autoUpdateNewsItems | Discovered news items from RSS feeds. |
| jobs | jobs | Background task queue. |
| groundskeeper_runs | groundskeeperRuns | Maintenance daemon execution history. |
| service_health_incidents | serviceHealthIncidents | Infrastructure incident tracking. |
| personnel | personnel | Personnel records (person-to-org role assignments). |
| grants | grants | Grant records. |
| funding_rounds | fundingRounds | Company funding round data. |
| investments | investments | Investment records. |
| equity_positions | equityPositions | Equity ownership snapshots. |
| divisions | divisions | Organizational sub-units. |
| division_personnel | divisionPersonnel | Division staff assignments. |
| funding_programs | fundingPrograms | Open funding opportunities. |
| benchmarks | benchmarks | Evaluation benchmark definitions. |
| benchmark_results | benchmarkResults | Model scores on benchmarks. |
| record_verifications | | Replaced by unified source_check_evidence table (migration 0127). |
| record_verdicts | | Replaced by unified source_check_verdicts table (migration 0127). |
| research_areas | researchAreas | Research area taxonomy. |

Naming Confusions (and How to Read Them)

"Entity" means different things in different contexts

| Context | What "entity" means | Example |
| --- | --- | --- |
| data/entities/*.yaml | A YAML catalog entry describing a real-world thing (org, person, risk, concept) | data/entities/organizations.yaml has an entry for Anthropic |
| entities PG table | A read mirror of those same YAML entries | SELECT * FROM entities WHERE id = 'anthropic' |
| FactBase Entity type | A FactBase thing: an entity in the structured facts system with its own ID scheme | packages/factbase/data/things/anthropic.yaml has id: mK9pX3rQ7n |
| factbase.ts getFactBaseEntity() | Returns a FactBase entity by its FactBase ID or YAML slug | Returns the FactBase entity object |

Key distinction: The YAML/PG entities table uses slug-based IDs (e.g., anthropic), while FactBase entities have their own 10-character alphanumeric IDs (e.g., mK9pX3rQ7n). The factbase-data.json file includes a slugToEntityId mapping to bridge between them.
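
To show how the slugToEntityId mapping can bridge the two ID schemes, here is a small sketch that accepts either a slug or a FactBase ID and returns the FactBase ID. It assumes the mapping is a plain object from slug to ID; the `FactBaseData` interface, `toFactBaseId`, and the regular expression for the 10-character alphanumeric form are hypothetical and only cover the one field used here.

```typescript
// Hypothetical slice of factbase-data.json: only the field used here.
interface FactBaseData {
  slugToEntityId: Record<string, string>;
}

// FactBase IDs are described as 10-character alphanumeric strings.
const FACTBASE_ID_PATTERN = /^[A-Za-z0-9]{10}$/;

// Accept either a slug ("anthropic") or a FactBase ID ("mK9pX3rQ7n")
// and return the FactBase ID, or undefined if neither form matches.
function toFactBaseId(data: FactBaseData, idOrSlug: string): string | undefined {
  // Check the slug map first: a slug could itself be 10 alphanumerics.
  if (Object.prototype.hasOwnProperty.call(data.slugToEntityId, idOrSlug)) {
    return data.slugToEntityId[idOrSlug];
  }
  return FACTBASE_ID_PATTERN.test(idOrSlug) ? idOrSlug : undefined;
}
```

Note that the pattern check only confirms the shape of an ID, not that an entity with that ID exists; a real lookup would still need to consult the FactBase data.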

"Things" means different things in different contexts

| Context | What "things" means | Example |
| --- | --- | --- |
| packages/factbase/data/things/ | FactBase entity YAML files: one file per entity with facts, properties, and metadata | packages/factbase/data/things/anthropic.yaml |
| things PG table | A cross-base universal index that indexes items from ALL domains | A row with thing_type='entity' and source_table='entities' |

Key distinction: The FactBase "things" directory (packages/factbase/data/things/) contains YAML data files that define entities and their facts. The PG things table is a completely separate concept — it is a universal search/browse index that contains rows pointing to entities, facts, grants, resources, personnel, and other record types. They share a name but serve unrelated purposes.

"Facts" means different things in different contexts

| Context | What "facts" means | Example |
| --- | --- | --- |
| packages/factbase/data/things/*.yaml facts | Structured triples in the FactBase YAML package (the source of truth) | property: revenue, value: 6000000000, asOf: 2025-01 |
| facts PG table | A read mirror of FactBase YAML facts in PostgreSQL | Rows synced from YAML during build |
| data/facts/*.yaml (legacy) | The old YAML facts system, deprecated for entities that have FactBase entries | Powers legacy <F> and <Calc> components |
| tablebase.ts Fact interface | A legacy bridge type kept for backward compatibility | Used by calc-engine and a few old components |

Key distinction: The authoritative source for structured facts is the FactBase YAML (packages/factbase/data/things/). The PG facts table is a read mirror for API queries. The old data/facts/*.yaml system is deprecated for FactBase-covered entities (see Data System Authority Rules).


Code Module Map

| Module | Base | Role |
| --- | --- | --- |
| apps/web/src/data/tablebase.ts | TableBase | Loads database.json, provides entity/resource/page lookups |
| apps/web/src/data/factbase.ts | FactBase | Loads factbase-data.json, provides fact/property/record lookups |
| apps/web/scripts/build-data.mjs | All | Transforms YAML + MDX into JSON build artifacts and syncs to PG |
| packages/factbase/ | FactBase | Core FactBase package: serialization, types, YAML loading |
| apps/wiki-server/src/schema.ts | All | Drizzle ORM schema defining all PG tables |
| apps/wiki-server/src/routes/entities.ts | TableBase | API for YAML entity data |
| apps/wiki-server/src/routes/facts.ts | FactBase | API for FactBase fact data |
| apps/wiki-server/src/routes/pages.ts | WikiBase | API for wiki page metadata and search |
| apps/wiki-server/src/routes/things.ts | Cross-Base | API for the universal things index |

  • System Architecture — High-level technical overview
  • DB Schema Overview — Full ER diagrams and migration history
  • Data System Authority Rules — Which data system is authoritative for each entity
  • Fact System Strategy — Strategy for the old YAML facts system