Knowledge Graph Ontology: Design & Implementation Status

Motivation

This document was originally written (February 2025) when the wiki had two disconnected data layers: a small set of numeric facts in data/facts/*.yaml and thousands of unstructured prose claims across ~700 MDX pages. It proposed a property ontology to bridge those layers.

Since then, the KB system (packages/kb/) has been built and implements most of what was proposed here. The KB system provides:

100+ typed properties defined in packages/kb/data/properties.yaml with display config, categories, temporal flags, inverseId declarations, and appliesTo constraints
19 entity type schemas in packages/kb/data/schemas/ defining which properties are required/recommended per type
360+ entity files in packages/kb/data/things/ with structured facts (number, text, date, boolean, ref, refs, range, min, json value types)
Temporal support via asOf and validEnd fields on facts
Source attribution via source URL, sourceResource, sourceQuote, and notes fields
Inverse relationships auto-computed from inverseId declarations on properties
Record collections for structured sub-data (funding-rounds, key-people, products, model-releases, board-members, etc.)
23 validation rules covering integrity, consistency, and quality
Custom YAML tags (!ref for entity references, !date for dates)

The Fact System Strategy refined how numeric facts work. The Claim-First Architecture proposal described making all claims first-class data objects. This document tackled the missing piece between those two: the property ontology — what types of claims exist, what constraints they have, and how they connect to entities. The KB system is the realized implementation of that ontology.

The rest of this document preserves the original design thinking and worked examples, annotated with what KB now implements versus what remains aspirational.

What the KB System Implements

The KB system (packages/kb/) now covers most knowledge graph concepts that were originally gaps:

Knowledge graph concept	KB implementation	Status
Entity classes (types)	19 type schemas in `packages/kb/data/schemas/` (organization, person, concept, ai-model, etc.)	Implemented
Typed properties	95 properties in `packages/kb/data/properties.yaml` with `dataType`, `category`, `appliesTo`	Implemented
Domain constraints	`appliesTo: [organization]` on each property	Implemented
Temporal qualifiers	`asOf` and `validEnd` fields on individual facts	Implemented
Source attribution	`source` URL, `sourceResource`, `sourceQuote`, `notes` per fact	Implemented
Entity relationships	`ref`/`refs` fact types with `inverseId` auto-computation	Implemented
Non-numeric facts	text, date, boolean, ref, refs, range, min, json value types	Implemented
Record collections	Structured sub-data: funding-rounds, key-people, products, model-releases, board-members	Implemented
Confidence / verification	Not in KB; source/notes fields provide partial coverage	Not yet implemented
Claim-type taxonomy	Not in KB; no factual/consensus/analytical/speculative distinction	Not yet implemented

The remaining gap is the claim-level metadata proposed in Layer 4 below: confidence levels, verification status, claim-type taxonomy, and explicit reasoning chains for analytical claims. The KB system captures what is true with sources, but does not yet capture how confident we are or what type of assertion this is.

Design Principles

1. Schema.org over Wikidata

Wikidata's property constraints are advisory warnings that flag violations. Schema.org's approach is softer: domainIncludes means "this property is typically used with these types," not "using it elsewhere is wrong."

For our system — where LLM agents do much of the authoring — Schema.org's descriptive approach is better. Properties should suggest what's expected, not hard-block. This matches how applicableTo on measures already works.
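A minimal sketch of what "suggest, don't hard-block" could look like in practice. The `PropertyDef` class and `check_applies_to` function are hypothetical names introduced for illustration, not part of packages/kb/. The only detail taken from this document is that founded-date declares `appliesTo: [organization, funder]`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PropertyDef:
    """Hypothetical in-memory view of one entry in properties.yaml."""
    id: str
    applies_to: List[str] = field(default_factory=list)  # empty means "any type"


def check_applies_to(prop: PropertyDef, entity_type: str) -> List[str]:
    """Return advisory warnings. Never raises, so authoring is not blocked."""
    if prop.applies_to and entity_type not in prop.applies_to:
        return [
            f"property '{prop.id}' is typically used with {prop.applies_to}, "
            f"not '{entity_type}'"
        ]
    return []


founded_date = PropertyDef("founded-date", ["organization", "funder"])
warnings_for_org = check_applies_to(founded_date, "organization")
warnings_for_person = check_applies_to(founded_date, "person")
print(warnings_for_org)
print(warnings_for_person)
```

The design choice here is that a mismatch produces a message for review rather than an exception, which suits a pipeline where LLM agents write most of the data.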

2. Small vocabulary, enforced at write time

Wikidata has ~13,200 properties. Schema.org has ~1,500. SNOMED (biomedical) has 65 relationship types for millions of concepts.

For ~700 entities across 24 types, the research suggests 80-150 total properties. But most value comes from a small core — Wikidata's top 100 properties cover the vast majority of statements. We should start with 30-50 properties and grow based on actual data needs.

The critical constraint from Wikidata's experience: properties are hard to remove once data uses them. Be thoughtful about additions, but don't let that paralyze progress.

3. Qualifiers over property explosion

Without qualifiers, you get property proliferation:

# BAD: separate properties for temporal variants
kalshi-valuation-series-d: $5B
kalshi-valuation-series-e: $11B

# GOOD: one property with qualifiers
- property: valuation
  value: 11000000000
  qualifiers:
    pointInTime: 2025-12
    fundingRound: Series E
    source: resource-abc

The asOf field on existing facts is already a qualifier. Extending this pattern to all claims is natural.
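To illustrate why one qualified property is easier to query than several suffixed ones, here is a rough sketch of a point-in-time lookup. The `value_at` helper and the dict layout are assumptions made for this example, and validEnd is ignored for brevity. The two valuations and their dates come from the Kalshi figures used elsewhere in this document.

```python
from typing import List, Optional

# Illustrative facts for a single property, each carrying its own qualifiers.
valuation_facts = [
    {"property": "valuation", "value": 5_000_000_000, "asOf": "2025-10",
     "qualifiers": {"fundingRound": "Series D"}},
    {"property": "valuation", "value": 11_000_000_000, "asOf": "2025-12",
     "qualifiers": {"fundingRound": "Series E"}},
]


def value_at(facts: List[dict], prop: str, when: str) -> Optional[dict]:
    """Return the most recent fact for `prop` whose asOf is <= `when`.

    Assumes asOf and `when` are ISO-style strings of the same precision
    (here YYYY-MM), so plain string comparison orders them correctly.
    """
    candidates = [f for f in facts if f["property"] == prop and f["asOf"] <= when]
    return max(candidates, key=lambda f: f["asOf"], default=None)


latest = value_at(valuation_facts, "valuation", "2026-01")
earlier = value_at(valuation_facts, "valuation", "2025-11")
before_any = value_at(valuation_facts, "valuation", "2024-01")
print(latest["qualifiers"]["fundingRound"], earlier["qualifiers"]["fundingRound"], before_any)
```

With separate per-round properties, the same question would require knowing every property name in advance.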

4. Resources as the authoring unit

A single TechCrunch article produces 8-12 claims. A single annual report might produce 50+. The relationship from a resource to its batch of claims is the natural unit for authoring and review. Claims should track which resource they were extracted from, enabling workflows like "show me everything we learned from this source."
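The "show me everything we learned from this source" workflow amounts to an index from resource to claims. The sketch below assumes claims are plain dicts with a `sources` list of `resourceId` entries, mirroring the hypothetical claim schema used in the worked example. The function name `claims_by_resource` is invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# A few claims in the hypothetical claim format (only the fields needed here).
claims = [
    {"id": "c-kalshi-001", "sources": [{"resourceId": "contrary-research-kalshi"}]},
    {"id": "c-kalshi-010", "sources": [{"resourceId": "sigma-world-timeline"}]},
    {"id": "c-kalshi-011", "sources": [{"resourceId": "sigma-world-timeline"}]},
]


def claims_by_resource(claims: List[dict]) -> Dict[str, List[str]]:
    """Group claim ids under every resource they cite, preserving input order."""
    index = defaultdict(list)
    for claim in claims:
        for source in claim.get("sources", []):
            index[source["resourceId"]].append(claim["id"])
    return dict(index)


by_resource = claims_by_resource(claims)
print(by_resource["sigma-world-timeline"])
```

A claim citing several resources appears under each of them, so a review pass over one resource still sees every claim it supports.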

Proposed Property Ontology

Layer 1: Universal Properties (apply to all entity types)

Status: Largely implemented in KB. Each KB entity file (packages/kb/data/things/*.yaml) has a thing: header with id, name, type, wikiId, and aliases. Many of the properties below now exist as KB properties:

Property	Value type	Description	KB status
`name`	string	Primary name	Implemented (`thing.name`)
`description`	string	Short description	Implemented (MDX frontmatter `description`)
`founded-date`	date	When created/founded	Implemented (KB property, `appliesTo: [organization, funder]`)
`end-date`	date	When dissolved/died	Not yet a KB property
`status`	enum	active, inactive, dissolved, deceased	Not yet a KB property
`official-url`	url	Primary website	Implemented (KB property `website`)
`wikipedia-url`	url	Wikipedia page	Not yet a KB property
`wikidata-id`	string	Wikidata Q-identifier	Not yet a KB property
`parent-org`	ref	Parent org, broader concept	Not yet a KB property

Layer 2: Type-Specific Properties

Status: Implemented in KB. The 19 schema files in packages/kb/data/schemas/ define required and recommended properties per entity type. For example, organization.yaml recommends founded-date, headquarters, revenue, valuation, headcount, legal-structure, total-funding and defines record collections for funding-rounds, key-people, products, etc.
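As a rough illustration of how a per-type schema can drive completeness checks, the sketch below reports which properties an entity is missing. The dict shape and the `missing_properties` helper are assumptions for this example. The recommended list is the one quoted above for organization.yaml, and the required list is left empty because this document does not enumerate it.

```python
from typing import Dict, List

# Hypothetical in-memory form of organization.yaml.
ORGANIZATION_SCHEMA = {
    "required": [],  # not listed in this document, so left empty here
    "recommended": [
        "founded-date", "headquarters", "revenue", "valuation",
        "headcount", "legal-structure", "total-funding",
    ],
}


def missing_properties(schema: Dict[str, List[str]], present: List[str]) -> Dict[str, List[str]]:
    """Return the required and recommended properties absent from `present`."""
    present_set = set(present)
    return {
        level: [p for p in schema[level] if p not in present_set]
        for level in ("required", "recommended")
    }


# Illustrative entity with only some of the recommended properties filled in.
report = missing_properties(
    ORGANIZATION_SCHEMA,
    ["founded-date", "headquarters", "valuation", "total-funding"],
)
print(report)
```

Missing required properties would typically be treated as errors and missing recommended ones as hints, in keeping with the descriptive approach from the design principles.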

Organization properties (KB implementation):

Property	KB `dataType`	KB status	Notes
`headquarters`	text	Implemented	Location string
`founded-by`	refs	Implemented	With `inverseId: founder-of`
`legal-structure`	text	Implemented	nonprofit, llc, corp, etc.
`headcount`	number (temporal)	Implemented	With display config
`user-count`	number (temporal)	Implemented
`revenue`	number (temporal)	Implemented	Display: divisor 1e9, prefix $, suffix B
`valuation`	number (temporal)	Implemented	Display: divisor 1e9, prefix $, suffix B
`total-funding`	number (temporal)	Implemented
`market-share`	number (temporal)	Implemented	Display: suffix %
`gross-margin`	number (temporal)	Implemented	Display: suffix %
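The display settings in the Notes column (divisor, prefix, suffix) can be read as a small formatting recipe. The following sketch shows one plausible interpretation. `format_value` is a hypothetical helper, and the market-share number is made up for illustration.

```python
def format_value(value: float, divisor: float = 1, prefix: str = "", suffix: str = "") -> str:
    """Scale a raw stored number and wrap it in its display prefix and suffix.

    Uses the 'g' format, which drops trailing zeros (11.0 becomes "11") but
    switches to scientific notation beyond six significant digits.
    """
    scaled = value / divisor
    return f"{prefix}{scaled:g}{suffix}"


# Display config as described for valuation and revenue in the table above.
currency_billions = {"divisor": 1e9, "prefix": "$", "suffix": "B"}
percent = {"suffix": "%"}

valuation_text = format_value(11_000_000_000, **currency_billions)
funding_text = format_value(1_500_000_000, **currency_billions)
share_text = format_value(42.5, **percent)  # hypothetical market-share value
print(valuation_text, funding_text, share_text)
```

Storing the raw number (11000000000) and formatting at render time keeps facts comparable and sortable regardless of how they are displayed.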

Person properties (KB implementation):

Property	KB `dataType`	KB status	Notes
`born-year`	number	Implemented
`education`	text	Implemented
`employed-by`	ref (temporal)	Implemented	With `inverseId: employer-of`
`role`	text (temporal)	Implemented	Job title at primary employer
`notable-for`	text	Implemented	One-line summary
`net-worth`	number (temporal)	Implemented
`founder-of`	refs (computed)	Implemented	Auto-computed from `founded-by` inverse

Risk/concept properties (from original proposal):

Property	Proposed value type	KB status
`riskCategory`	enum	Not yet a KB property
`affectedBy`	refs	Not yet a KB property
`mitigatedBy`	refs	Not yet a KB property
`estimatedProbability`	number	Not yet a KB property

Risk and concept schemas exist (packages/kb/data/schemas/risk.yaml, concept.yaml) but have fewer defined properties than organization and person schemas. These remain an area for growth.

Layer 3: Relationship Properties (cross-entity claims)

Status: Partially implemented in KB. The KB system uses ref and refs data types for entity relationships, with inverseId declarations enabling automatic inverse computation. Many of the proposed relationship properties now exist:

Property	KB status	KB `inverseId`	Notes
`employed-by`	Implemented	`employer-of`	ref, temporal
`founded-by`	Implemented	`founder-of`	refs
`parent-org`	Not yet	—
`subsidiary-of`	Not yet	—
`invested-in`	Not yet	—	Partially covered by funding-rounds record collection
`partnered-with`	Not yet	—
`regulated-by`	Not yet	—
`sued-by`	Not yet	—
`competes-with`	Not yet	—
`addresses`	Not yet	—	For approach/policy to risk
`implemented-by`	Not yet	—	For approach to org/project
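The automatic inverse computation can be pictured as a single pass over all ref/refs facts. The sketch below is an assumption about the general shape of that pass, not the KB system's actual code. The `compute_inverses` function and the flat fact dicts are illustrative, the two inverseId pairs come from the table above, and the employed-by fact is a made-up example.

```python
from collections import defaultdict
from typing import Dict, List

# Property declarations, reduced to the fields this sketch needs.
PROPERTIES = {
    "founded-by": {"dataType": "refs", "inverseId": "founder-of"},
    "employed-by": {"dataType": "ref", "inverseId": "employer-of"},
    "headquarters": {"dataType": "text"},  # no inverse declared
}

FACTS = [
    {"subject": "kalshi", "property": "founded-by",
     "value": ["tarek-mansour", "luana-lopes-lara"]},
    {"subject": "tarek-mansour", "property": "employed-by", "value": "kalshi"},
    {"subject": "kalshi", "property": "headquarters", "value": "New York, NY"},
]


def compute_inverses(facts: List[dict], properties: Dict[str, dict]) -> Dict[str, Dict[str, List[str]]]:
    """Derive inverse facts: for each ref target, list the subjects pointing at it."""
    inverses = defaultdict(lambda: defaultdict(list))
    for fact in facts:
        inverse_id = properties.get(fact["property"], {}).get("inverseId")
        if inverse_id is None:
            continue  # property has no declared inverse
        value = fact["value"]
        targets = value if isinstance(value, list) else [value]
        for target in targets:
            inverses[target][inverse_id].append(fact["subject"])
    return {entity: dict(props) for entity, props in inverses.items()}


inverses = compute_inverses(FACTS, PROPERTIES)
print(inverses["tarek-mansour"])
print(inverses["kalshi"])
```

Because the inverse side is derived rather than authored, only one direction of each relationship needs to be maintained by hand.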

The KB's record collections (e.g., funding-rounds with lead_investor as a ref field) handle some relationship patterns that don't map cleanly to simple property-value facts. For example, a funding round involves an amount, date, valuation, and lead investor; this is better represented as a structured record than as separate invested-in facts.

Layer 4: Claim Types and Confidence

Status: Not yet implemented in KB. This layer remains aspirational. The KB system's source, sourceQuote, and notes fields provide basic provenance, but the structured confidence taxonomy below is not part of the current KB schema. This is the most significant gap between the proposal and the implementation.

Following the taxonomy from the Claim-First Architecture proposal:

Claim type	Verification strategy	Confidence levels	Example
factual	Check against source	verified, partial, unverified	"Kalshi was founded in 2018"
numeric	Check value + source; link to fact store	verified, estimated	"$11B valuation in Dec 2025"
consensus	Multiple independent sources	strong, moderate, weak	"Leading regulated US prediction market"
analytical	Check inference from supporting claims	supported, speculative	"Fee structure incentivizes liquidity"
speculative	Cannot verify — flag uncertainty	plausible, uncertain, contested	"Regulatory moat may narrow"
relational	Cross-entity verification	verified, partial	"Growth rate exceeded Polymarket's"
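If this taxonomy were adopted, the table could double as a validation rule: each claim type admits only its own confidence levels. The sketch below encodes the table directly. Since this layer is not implemented in KB, the `confidence_problem` function and the claim dict shape are purely hypothetical.

```python
from typing import Optional

# Allowed confidence levels per claim type, copied from the table above.
ALLOWED_CONFIDENCE = {
    "factual": {"verified", "partial", "unverified"},
    "numeric": {"verified", "estimated"},
    "consensus": {"strong", "moderate", "weak"},
    "analytical": {"supported", "speculative"},
    "speculative": {"plausible", "uncertain", "contested"},
    "relational": {"verified", "partial"},
}


def confidence_problem(claim: dict) -> Optional[str]:
    """Return a description of the mismatch, or None if the claim is consistent."""
    allowed = ALLOWED_CONFIDENCE.get(claim["type"])
    if allowed is None:
        return f"unknown claim type '{claim['type']}'"
    if claim["confidence"] not in allowed:
        return (
            f"confidence '{claim['confidence']}' is not valid for type "
            f"'{claim['type']}' (expected one of {sorted(allowed)})"
        )
    return None


ok = confidence_problem({"type": "factual", "confidence": "verified"})
bad = confidence_problem({"type": "numeric", "confidence": "partial"})
unknown = confidence_problem({"type": "rumor", "confidence": "weak"})
print(ok, bad, unknown, sep="\n")
```

A check like this would catch claims whose confidence label was borrowed from a different claim type during authoring.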

Worked Example: Kalshi Page Decomposed

Note (March 2026): This worked example was created before the KB system existed and uses a hypothetical claim schema (data/claims/kalshi.yaml) that was never implemented. Many of the properties used below (e.g., foundedBy, fundedBy, headquarters) now exist as KB properties in packages/kb/data/properties.yaml, and entity data like funding rounds are stored as structured items in KB entity files (packages/kb/data/things/kalshi.yaml). The claim-level metadata (confidence, type taxonomy, reasoning chains) shown here remains aspirational — the KB system does not yet track these. The example is preserved because it demonstrates the analytical decomposition process and the value of structured claims.

The Kalshi wiki page contains ~332 lines of prose with 87 footnoted citations. Below is how its content would decompose into structured claims using the property ontology above.

Entity Profile (Structured Properties)

These are property-value pairs on the Kalshi entity itself — the equivalent of a Wikidata item's core statements:

# Hypothetical format — not implemented. In the KB system, entity data lives
# in packages/kb/data/things/kalshi.yaml using the KB fact schema.
entity: kalshi
entityType: organization
subcategory: architecture

properties:
  name: "Kalshi"
  previousName: "Kownig"
  foundedDate: 2018
  headquarters: "New York, NY"
  officialUrl: "https://kalshi.com"
  wikipediaUrl: "https://en.wikipedia.org/wiki/Kalshi"
  wikidataId: "Q114586938"
  industry: [prediction-markets, financial-services, sports-betting]
  legalForm: corporation
  revenueModel: "Transaction fees (maker-taker: 0.07%-7% on takers)"
  regulatoryStatus:
    value: "CFTC-licensed Designated Contract Market (DCM)"
    qualifiers:
      grantedDate: 2020-11
      jurisdiction: federal
      source: sigma-world-timeline
  marketCategories:
    - politics
    - sports
    - economics
    - climate
    - finance-crypto
    - culture
    - technology-science

  founder:
    - entity: tarek-mansour
      role: CEO
    - entity: luana-lopes-lara
      role: COO

  # These link to numeric facts (existing system)
  factRefs:
    valuation: { factId: tbd, asOf: 2025-12, value: 11000000000 }
    totalFunding: { factId: tbd, asOf: 2025-12, value: 1500000000 }
    tradingVolume: { factId: tbd, asOf: 2025-12, value: "40-50B annualized" }

Factual Claims (52 claims)

These are verifiable assertions extracted from the prose. Each links to its source and can be independently verified, updated, or superseded.

# Hypothetical claim schema — not implemented as shown. The KB system stores
# facts per-entity in packages/kb/data/things/kalshi.yaml without claim-level
# metadata like confidence, type taxonomy, or reasoning chains.
entity: kalshi
claims:

  # === FOUNDING & HISTORY (8 claims) ===

  - id: c-kalshi-001
    property: foundedBy
    text: "Founded in 2018 by MIT graduates Tarek Mansour and Luana Lopes Lara"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        quote: "Founded in 2018 by Tarek Mansour and Luana Lopes Lara"
        support: full
    temporal: { type: historical, date: 2018 }
    entityRefs:
      - { id: kalshi, role: subject }
      - { id: tarek-mansour, role: founder }
      - { id: luana-lopes-lara, role: founder }
    cluster: kalshi-founding

  - id: c-kalshi-002
    property: educatedAt
    text: "Founders met at MIT studying computer science, mathematics, and electrical engineering"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical }
    entityRefs:
      - { id: tarek-mansour, role: subject }
      - { id: luana-lopes-lara, role: subject }
    cluster: kalshi-founding

  - id: c-kalshi-003
    property: employedAt
    text: "Founders completed internships at Goldman Sachs, Palantir, Five Rings Capital, and Citadel"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical }
    entityRefs:
      - { id: tarek-mansour, role: subject }
      - { id: luana-lopes-lara, role: subject }
    cluster: kalshi-founding

  - id: c-kalshi-004
    property: employedAt
    text: "Mansour worked as a quantitative trader at Citadel (May 2018 to May 2019)"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical, startDate: 2018-05, endDate: 2019-05 }
    entityRefs:
      - { id: tarek-mansour, role: subject }
    cluster: kalshi-founding

  - id: c-kalshi-005
    text: "Company was initially named 'Kownig'"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical, date: 2018 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-founding

  - id: c-kalshi-006
    text: "Joined Y Combinator Winter 2019 batch"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical, date: 2019 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-founding

  - id: c-kalshi-007
    text: "Founders motivated by gaps in hedging event outcomes, particularly after observing Brexit market shocks"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-founding

  - id: c-kalshi-008
    text: "Early years focused on regulatory compliance — built exchange, broker, and surveillance systems before acquiring users"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-founding

  # === REGULATORY MILESTONES (7 claims) ===

  - id: c-kalshi-010
    property: regulatedBy
    text: "CFTC approved Kalshi as first designated contract market (DCM) for event contracts"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2020-11 }
    entityRefs:
      - { id: kalshi, role: subject }
    cluster: kalshi-regulation

  - id: c-kalshi-011
    text: "Official platform launch enabling trades on economic events"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2021-07 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-regulation

  - id: c-kalshi-012
    text: "CFTC began examining political event contracts proposal"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2022-08 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-regulation

  - id: c-kalshi-013
    text: "CFTC rejected political contracts, citing gambling risks"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2023-09 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-regulation

  - id: c-kalshi-014
    property: suedBy  # direction caveat: Kalshi is the plaintiff here, so the CFTC is the party being sued
    text: "Kalshi sued CFTC for regulatory overreach"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2023-11 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-regulation

  - id: c-kalshi-015
    text: "Federal court ruled unanimously in Kalshi's favor, allowing election contracts to resume"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2024-09 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-regulation

  - id: c-kalshi-016
    text: "Robinhood launched prediction hub on Kalshi infrastructure"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2024-11 }
    entityRefs:
      - { id: kalshi, role: subject }
    cluster: kalshi-partnerships

  # === FUNDING ROUNDS (7 claims) ===

  - id: c-kalshi-020
    property: fundedBy
    text: "Raised $6.1 million in seed round"
    type: numeric
    confidence: verified
    value: 6100000
    measure: funding-round
    sources:
      - resourceId: nasdaq-private-market-kalshi
        support: full
    temporal: { type: historical, date: 2020-03 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding

  - id: c-kalshi-021
    property: fundedBy
    text: "Raised $30 million Series A at $120 million valuation"
    type: numeric
    confidence: verified
    value: 30000000
    measure: funding-round
    qualifiers:
      valuation: 120000000
      round: "Series A"
    sources:
      - resourceId: cbinsights-kalshi
        support: full
    temporal: { type: historical, date: 2020-12 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding

  - id: c-kalshi-022
    property: fundedBy
    text: "Raised $60 million Series B"
    type: numeric
    confidence: verified
    value: 60000000
    measure: funding-round
    qualifiers:
      round: "Series B"
    sources:
      - resourceId: nasdaq-private-market-kalshi
        support: full
    temporal: { type: historical, date: 2025-06 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding

  - id: c-kalshi-023
    property: fundedBy
    text: "Raised $125 million Series C"
    type: numeric
    confidence: verified
    value: 125000000
    measure: funding-round
    qualifiers:
      round: "Series C"
    sources:
      - resourceId: nasdaq-private-market-kalshi
        support: full
    temporal: { type: historical, date: 2025-06 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding

  - id: c-kalshi-024
    property: fundedBy
    text: "Raised $300 million Series D at $5 billion valuation, led by Sequoia Capital and Andreessen Horowitz"
    type: numeric
    confidence: verified
    value: 300000000
    measure: funding-round
    qualifiers:
      valuation: 5000000000
      round: "Series D"
      leadInvestor: [sequoia-capital, andreessen-horowitz]
    sources:
      - resourceId: equityzen-kalshi
        support: full
    temporal: { type: historical, date: 2025-10 }
    entityRefs:
      - { id: kalshi, role: subject }
      - { id: sequoia-capital, role: investor }
      - { id: andreessen-horowitz, role: investor }
    cluster: kalshi-funding

  - id: c-kalshi-025
    property: fundedBy
    text: "Raised $1 billion Series E at $11 billion valuation"
    type: numeric
    confidence: verified
    value: 1000000000
    measure: funding-round
    qualifiers:
      valuation: 11000000000
      round: "Series E"
      leadInvestor: [paradigm]
      participants: [sequoia-capital, andreessen-horowitz, meritech-capital,
                     ivp, ark-invest, anthos-capital, capitalg, y-combinator]
    sources:
      - resourceId: kalshi-series-e-announcement
        quote: "$11 billion valuation"
        support: full
    temporal: { type: historical, date: 2025-12 }
    entityRefs:
      - { id: kalshi, role: subject }
    cluster: kalshi-funding

  - id: c-kalshi-026
    text: "Total funding raised exceeds $1.5 billion across nine rounds"
    type: numeric
    confidence: verified
    value: 1500000000
    measure: total-funding
    sources:
      - resourceId: cbinsights-kalshi
        support: full
    temporal: { type: point-in-time, asOf: 2025-12 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding

  # === PLATFORM OPERATIONS (6 claims) ===

  - id: c-kalshi-030
    text: "Operates as peer-to-peer marketplace using central limit order book (CLOB)"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-platform

  - id: c-kalshi-031
    text: "Binary yes/no contracts on verifiable real-world events; price reflects market probability"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-how-markets-work
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-platform

  - id: c-kalshi-032
    text: "Maker-taker fee structure: makers pay no fees, takers pay 0.07%-7%"
    type: factual
    confidence: verified
    sources:
      - resourceId: sacra-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-platform

  - id: c-kalshi-033
    text: "Real-time sentiment data valued at over $50,000 annually by institutional clients"
    type: numeric
    confidence: estimated
    sources:
      - resourceId: sacra-kalshi
        support: partial  # Sacra estimate, not Kalshi-confirmed
    temporal: { type: point-in-time, asOf: 2025 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-platform

  - id: c-kalshi-034
    text: "Trading volumes exceeding $1 billion weekly by late 2025"
    type: numeric
    confidence: verified
    sources:
      - resourceId: kalshi-series-e-announcement
        support: full
    temporal: { type: point-in-time, asOf: 2025-12 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-scale

  - id: c-kalshi-035
    text: "Sports betting accounts for approximately 90% of trading volume"
    type: numeric
    confidence: verified
    sources:
      - resourceId: natlawreview-kalshi-records
        support: full
    temporal: { type: point-in-time, asOf: 2025-12 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-scale

  # === PARTNERSHIPS (10 claims) ===

  - id: c-kalshi-040
    property: partneredWith
    text: "Multiyear partnership with NHL including official data, logos, and broadcast signage"
    type: factual
    confidence: verified
    qualifiers:
      partnershipType: sports-data
      startDate: 2025
    sources:
      - resourceId: nhl-kalshi-partnership
        support: full
    temporal: { type: ongoing }
    entityRefs:
      - { id: kalshi, role: subject }
    cluster: kalshi-partnerships

  - id: c-kalshi-041
    property: partneredWith
    text: "First brand partnership between a North American sports team and a prediction market (Chicago Blackhawks)"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-blackhawks
        support: full
    temporal: { type: historical, date: 2025 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships

  - id: c-kalshi-042
    property: partneredWith
    text: "Three-year strategic partnership with STATSCORE for premium sports data"
    type: factual
    confidence: verified
    qualifiers:
      partnershipType: data-provider
      duration: "3 years"
    sources:
      - resourceId: statscore-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships

  - id: c-kalshi-043
    property: partneredWith
    text: "StockX partnership for sneaker/apparel/collectible event contracts"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-stockx
        support: full
    temporal: { type: historical, date: 2025-11 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships

  - id: c-kalshi-044
    property: partneredWith
    text: "Barchart partnership providing prediction data to 32 million investors"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-barchart
        support: full
    temporal: { type: historical, date: 2025-11 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships

  - id: c-kalshi-045
    property: partneredWith
    text: "Robinhood prediction hub built on Kalshi infrastructure; accounts for over half of Kalshi's betting volume"
    type: factual
    confidence: partial
    sources:
      - resourceId: cryptopolitan-kalshi-2026
        support: partial  # "reportedly" qualifier in source
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships

  - id: c-kalshi-046
    property: partneredWith
    text: "CNN integration with real-time prediction market news ticker"
    type: factual
    confidence: verified
    sources:
      - resourceId: popular-info-casinofication
        support: full
    temporal: { type: historical, date: 2025 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships

  - id: c-kalshi-047
    property: partneredWith
    text: "CNBC partnership for 2026 editorial coverage integration"
    type: factual
    confidence: verified
    sources:
      - resourceId: popular-info-casinofication
        support: full
    temporal: { type: historical, date: 2026 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships

  - id: c-kalshi-048
    property: partneredWith
    text: "Coinbase Custody partnership for USDC-powered event trading"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-coinbase-custody
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships

  - id: c-kalshi-049
    text: "Integration with TRON network, Phantom crypto wallet; serving as in-house prediction market for Coinbase"
    type: factual
    confidence: verified
    sources:
      - resourceId: natlawreview-kalshi-records
        support: full
    temporal: { type: historical, date: 2025 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships

  # === COMPETITION (4 claims) ===

  - id: c-kalshi-050
    property: competesWith
    text: "Kalshi and Polymarket form a duopoly, collectively over $44 billion trading volume in 2025"
    type: relational
    confidence: verified
    sources:
      - resourceId: cryptopolitan-kalshi-2026
        support: full
    temporal: { type: point-in-time, asOf: 2025 }
    entityRefs:
      - { id: kalshi, role: subject }
      - { id: polymarket, role: competitor }
    cluster: kalshi-competition

  - id: c-kalshi-051
    text: "Combined daily volume record of $799 million across both platforms (Jan 17, 2026)"
    type: numeric
    confidence: verified
    sources:
      - resourceId: thestreet-prediction-markets
        support: full
    temporal: { type: historical, date: 2026-01-17 }
    entityRefs:
      - { id: kalshi, role: subject }
      - { id: polymarket, role: mentioned }
    cluster: kalshi-competition

  - id: c-kalshi-052
    text: "Polymarket operates using cryptocurrency and has faced restrictions on US users"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs:
      - { id: polymarket, role: subject }
      - { id: kalshi, role: implied_comparison }
    cluster: kalshi-competition

  - id: c-kalshi-053
    text: "Kalshi's competitive advantages: legal clarity for US users, segregated customer funds with deposit insurance, institutional partnerships"
    type: analytical
    confidence: supported
    supportingClaims: [c-kalshi-010, c-kalshi-040, c-kalshi-045]
    sources:
      - resourceId: kalshi-about
        support: partial
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-competition

  # === REGULATORY CHALLENGES (7 claims) ===

  - id: c-kalshi-060
    property: suedBy
    text: "New York State Gaming Commission issued cease and desist for offering sports betting without state license"
    type: factual
    confidence: verified
    sources:
      - resourceId: cbs-kalshi-regulation
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal

  - id: c-kalshi-061
    property: suedBy
    text: "Federal judge ruled Kalshi must stop offering prediction contracts in Nevada"
    type: factual
    confidence: verified
    sources:
      - resourceId: nevada-independent-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal

  - id: c-kalshi-062
    text: "Nearly two dozen states and tribal gaming authorities have filed federal lawsuits"
    type: factual
    confidence: verified
    sources:
      - resourceId: nevada-independent-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal

  - id: c-kalshi-063
    text: "US District Judge Andrew Gordon ruled event contracts on sporting outcomes do not fall within CFTC's exclusive jurisdiction"
    type: factual
    confidence: verified
    sources:
      - resourceId: nevada-independent-kalshi
        quote: "do not fall within the CFTC's exclusive jurisdiction"
        support: full
    temporal: { type: historical }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal

  - id: c-kalshi-064
    text: "Native American gaming tribes filed briefs calling activities 'brazenly illegal'"
    type: factual
    confidence: verified
    sources:
      - resourceId: nexteventhorizon-kalshi-nfl
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal

  - id: c-kalshi-065
    text: "Class action lawsuit claiming market makers place customers at disadvantage"
    type: factual
    confidence: verified
    sources:
      - resourceId: igaming-kalshi-class-action
        support: full
    temporal: { type: historical, date: 2025 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal

  - id: c-kalshi-066
    text: "Separate New York class action alleging illegal sports betting and deceptive practices"
    type: factual
    confidence: verified
    sources:
      - resourceId: wikipedia-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal

  # === OPERATIONAL CONCERNS (5 claims) ===

  - id: c-kalshi-070
    text: "Incorrectly graded multiple NFL win totals markets; refunded original amounts without paying winners"
    type: factual
    confidence: verified
    sources:
      - resourceId: nexteventhorizon-kalshi-nfl
        support: full
    temporal: { type: historical }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-operations

  - id: c-kalshi-071
    text: "Minimal age verification mechanisms despite advertising as legal for 18+"
    type: factual
    confidence: verified
    sources:
      - resourceId: gamblingharm-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-operations

  - id: c-kalshi-072
    text: "No prominent messaging or resources for users experiencing gambling problems"
    type: factual
    confidence: verified
    sources:
      - resourceId: gamblingharm-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-operations

  - id: c-kalshi-073
    text: "Insider trading violations do not currently carry criminal penalties on regulated prediction platforms"
    type: factual
    confidence: verified
    sources:
      - resourceId: axios-kalshi-insider-trading
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-operations

  - id: c-kalshi-074
    text: "$30,000 bet on Polymarket yielded $436,760 from suspected insider information (Maduro capture contract)"
    type: factual
    confidence: verified
    sources:
      - resourceId: axios-kalshi-insider-trading
        support: full
    temporal: { type: historical }
    entityRefs:
      - { id: polymarket, role: subject }
      - { id: kalshi, role: context }
    cluster: kalshi-operations

  # === AI SAFETY RELEVANCE (3 claims) ===

  - id: c-kalshi-080
    text: "AI Research Pause Market (KXAI PAUSE-27) assigns 12% probability to any major AI company pausing research for safety before January 2027"
    type: numeric
    confidence: verified
    value: 0.12
    sources:
      - resourceId: kalshi-ai-pause-market
        support: full
    temporal: { type: point-in-time, asOf: 2026-02 }
    entityRefs:
      - { id: kalshi, role: platform }
      - { id: anthropic, role: mentioned }
      - { id: openai, role: mentioned }
    cluster: kalshi-ai-safety

  - id: c-kalshi-081
    text: "Offers contracts on whether AI regulation will become federal law by January 2027 (KXAI LEGISLATION-27)"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-ai-legislation-market
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: platform }]
    cluster: kalshi-ai-safety

  - id: c-kalshi-082
    text: "Low probability assignments suggest trader skepticism about imminent AI safety pauses"
    type: analytical
    confidence: partial
    supportingClaims: [c-kalshi-080]
    sources: []
    temporal: { type: point-in-time, asOf: 2026-02 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-ai-safety

  # === CONSENSUS & ANALYTICAL CLAIMS (5 claims) ===

  - id: c-kalshi-090
    text: "Kalshi's mission is to democratize finance by enabling everyday users to capitalize on opinions and hedge personal risks"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-about
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]

  - id: c-kalshi-091
    text: "EA Forum communities give generally positive but cautious reception; recognize legal milestones but note attention costs and limited EA-relevant coverage"
    type: consensus
    confidence: moderate
    consensusBreadth: 3
    sources:
      - resourceId: ea-forum-prediction-markets-corporate
        support: partial
      - resourceId: ea-forum-predicting-for-good
        support: partial
      - resourceId: kalshi-harnessing-prediction-markets
        support: partial
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-reception

  - id: c-kalshi-092
    text: "Fee structure designed to incentivize liquidity provision while remaining competitive with traditional sportsbook vigorish"
    type: analytical
    confidence: partial
    supportingClaims: [c-kalshi-032]
    reasoning: "Zero maker fees + low taker fees (0.07-7%) is typical exchange design for liquidity. Fee range is lower than typical sportsbook vig (5-10%)."
    sources: []
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]

  - id: c-kalshi-093
    text: "Prediction markets systematically fail on fat-tailed, improbable events like revolutions"
    type: consensus
    confidence: moderate
    consensusBreadth: 2
    sources:
      - resourceId: kalshi-harnessing-prediction-markets
        support: partial
    temporal: { type: ongoing }
    entityRefs:
      - { id: kalshi, role: context }
      - { id: prediction-markets, role: subject }
      - { id: philip-tetlock, role: mentioned }
    cluster: kalshi-reception

  - id: c-kalshi-094
    text: "5.5x valuation increase from Series D to Series E in just six months"
    type: numeric
    confidence: verified
    value: 5.5
    supportingClaims: [c-kalshi-024, c-kalshi-025]
    sources:
      - resourceId: signalhub-kalshi
        support: full
    temporal: { type: historical, date: 2025-12 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding

Summary Statistics

Category	Claim count	Types
Founding & history	8	8 factual
Regulatory milestones	7	7 factual
Funding rounds	7	6 numeric, 1 factual
Platform operations	6	4 factual, 2 numeric
Partnerships	10	10 factual
Competition	4	1 relational, 1 numeric, 1 factual, 1 analytical
Legal challenges	7	7 factual
Operational concerns	5	5 factual
AI safety relevance	3	1 numeric, 1 factual, 1 analytical
Consensus & analytical	5	1 factual, 2 consensus, 1 analytical, 1 numeric
Total	62	45 factual, 11 numeric, 3 analytical, 2 consensus, 1 relational
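
Type breakdowns like these are easy to get wrong when maintained by hand. The following is a minimal sketch of deriving a summary row from claim data instead. It assumes claims are loaded as dictionaries carrying the `type` field from the YAML above; `tally_types` and `format_row` are hypothetical helper names, not part of any existing tooling.

```python
from collections import Counter

# Sample: the five claims from the "Consensus & analytical" section,
# reduced to the fields needed for counting.
claims = [
    {"id": "c-kalshi-090", "type": "factual"},
    {"id": "c-kalshi-091", "type": "consensus"},
    {"id": "c-kalshi-092", "type": "analytical"},
    {"id": "c-kalshi-093", "type": "consensus"},
    {"id": "c-kalshi-094", "type": "numeric"},
]


def tally_types(claims):
    """Count claims per `type` value."""
    return Counter(claim["type"] for claim in claims)


def format_row(category, claims):
    """Render one tab-separated summary row: category, count, type breakdown."""
    counts = tally_types(claims)
    breakdown = ", ".join(f"{n} {t}" for t, n in counts.most_common())
    return f"{category}\t{len(claims)}\t{breakdown}"


row = format_row("Consensus & analytical", claims)
print(row)
```

Running the same tally over every category and summing the counters would give the Total row, so the per-category rows and the total cannot drift apart.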

Clusters Used

Cluster	Claims	What it groups
`kalshi-founding`	8	Origin story, founders, early decisions
`kalshi-regulation`	7	CFTC approval and political contracts saga
`kalshi-funding`	8	All funding rounds + totals
`kalshi-platform`	4	How the product works
`kalshi-scale`	2	Volume and usage metrics
`kalshi-partnerships`	10	All partnership claims
`kalshi-competition`	4	Polymarket comparison, market dynamics
`kalshi-legal`	7	State lawsuits, tribal opposition, class actions
`kalshi-operations`	5	Grading errors, consumer protection, insider trading
`kalshi-ai-safety`	3	AI-related prediction markets
`kalshi-reception`	2	EA/forecasting community views

Properties Used

Property	Usage count	Domain constraint
`partneredWith`	9	organization
`fundedBy`	6	organization
`suedBy`	3	organization, person
`employedAt`	2	person
`foundedBy`	1	organization
`educatedAt`	1	person
`regulatedBy`	1	organization
`competesWith`	1	organization
(no property — free claim)	38	—

This reveals something important: 24 of 62 claims use typed relationship properties (39%), while 38 are free-text claims without a formal property. This is expected and fine — not every assertion maps cleanly to a property. The free claims still benefit from structured metadata (source, confidence, temporality, entity refs, cluster).
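
The 39% figure follows directly from the usage counts in the table. A short sketch of the arithmetic, with the counts copied from the table above (variable names are illustrative):

```python
# Usage counts copied from the "Properties Used" table.
property_usage = {
    "partneredWith": 9,
    "fundedBy": 6,
    "suedBy": 3,
    "employedAt": 2,
    "foundedBy": 1,
    "educatedAt": 1,
    "regulatedBy": 1,
    "competesWith": 1,
}
free_claims = 38  # claims with no formal property

typed_claims = sum(property_usage.values())
total_claims = typed_claims + free_claims
typed_share = typed_claims / total_claims

print(f"{typed_claims} of {total_claims} claims are typed ({typed_share:.0%})")
```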

Observations from the Kalshi Exercise

What decomposition reveals

Source concentration. Of 62 claims, 12 come from a single source (Contrary Research). Processing that one resource yields ~20% of all claims. This validates the resource-centric authoring workflow.
Temporal diversity matters. Claims span historical (founding, milestones), point-in-time (valuations, volumes), and ongoing (platform operations, legal status). Without temporal typing, a system can't distinguish "founded in 2018" (never changes) from "90% sports volume" (changes quarterly).
Cross-entity claims are high-value. Several of the competition claims (c-kalshi-050 through c-kalshi-053) reference both Kalshi and Polymarket, as does the insider-trading claim c-kalshi-074. These are exactly the claims that should appear on both entity pages and on a prediction markets comparison page: the multi-page reuse case.
Analytical claims need explicit reasoning chains. Claims c-kalshi-053, c-kalshi-082, and c-kalshi-092 are the wiki's own analysis. Making supportingClaims explicit prevents these from being treated as sourced facts and enables validation ("do the supporting claims actually support this inference?").
Clusters emerge naturally. The 11 clusters correspond roughly to the page's section structure but aren't identical. The kalshi-partnerships cluster spans what's actually two page sections (Sports + Data/Technology), while kalshi-regulation spans three sections. Clusters reflect topical groupings; page sections reflect editorial choices about presentation.
Most claims don't need relationship properties. Only 39% of claims use typed properties. The rest are free assertions with structured metadata. This means the property vocabulary can stay small — you don't need a property for every possible assertion.
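
The reasoning-chain observation above suggests a simple structural check. The sketch below is an assumption about how such a validation could look, not an existing KB rule: it flags analytical claims with no `supportingClaims` and supporting IDs that do not resolve to a known claim. It cannot judge whether the supporting claims actually justify the inference; that still needs a human or a separate review step.

```python
def check_reasoning_chains(claims):
    """Return human-readable problems found in analytical claims.

    Flags analytical claims that list no supporting claims, and
    supporting-claim IDs that do not match any claim in the input.
    """
    known_ids = {claim["id"] for claim in claims}
    problems = []
    for claim in claims:
        if claim.get("type") != "analytical":
            continue
        supporting = claim.get("supportingClaims", [])
        if not supporting:
            problems.append(f"{claim['id']}: analytical claim has no supportingClaims")
        for ref in supporting:
            if ref not in known_ids:
                problems.append(f"{claim['id']}: unknown supporting claim {ref}")
    return problems


claims = [
    {"id": "c-kalshi-080", "type": "numeric"},
    {"id": "c-kalshi-082", "type": "analytical", "supportingClaims": ["c-kalshi-080"]},
    # c-kalshi-032 is deliberately left out of this sample to show a dangling reference.
    {"id": "c-kalshi-092", "type": "analytical", "supportingClaims": ["c-kalshi-032"]},
]

problems = check_reasoning_chains(claims)
print(problems)
```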

What this enables

With these 62 claims structured, you could:

Generate a comparison page — Pull all competesWith claims plus numeric facts for Kalshi and Polymarket, compose a comparison view automatically.
Detect staleness — Point-in-time claims such as trading volume and AI market probability carry a temporal.asOf value (2026-02 for c-kalshi-080). Once that date is older than a chosen threshold, flag the claim for update.
Track source quality — 12 claims from Contrary Research, 7 from Sigma World, 6 from Kalshi's own announcements. If Contrary Research publishes an update, you know which claims to re-verify.
Build entity timelines — Filter claims by temporal.date, sort chronologically, and you have a timeline view without any additional authoring.
Check cross-entity consistency — If Polymarket's page says "$44B combined volume" and the Kalshi page says just "$44B," the system can surface that both statements reference the same underlying claim via the shared cluster.
Surface analytical gaps — The kalshi-ai-safety cluster has only 3 claims, all fairly shallow. A dashboard could flag "AI safety relevance is under-developed for this entity."
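
Two of these uses, staleness detection and timelines, need little more than the `temporal` block. The sketch below assumes claims are already loaded as dictionaries and that partial dates appear as `YYYY`, `YYYY-MM`, or `YYYY-MM-DD`, as in the examples above. The three-month threshold and all function names are illustrative assumptions.

```python
from datetime import date


def parse_partial_date(value):
    """Parse 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' (missing parts default to 1)."""
    parts = [int(p) for p in str(value).split("-")]
    year, month, day = (parts + [1, 1])[:3]
    return date(year, month, day)


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later`, ignoring days."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def stale_claims(claims, today, max_age_months):
    """Return IDs of point-in-time claims whose asOf exceeds the age threshold."""
    stale = []
    for claim in claims:
        temporal = claim.get("temporal", {})
        if temporal.get("type") != "point-in-time" or "asOf" not in temporal:
            continue
        age = months_between(parse_partial_date(temporal["asOf"]), today)
        if age > max_age_months:
            stale.append(claim["id"])
    return stale


def timeline(claims):
    """Return (date, id) pairs for claims with temporal.date, oldest first."""
    dated = [
        (parse_partial_date(c["temporal"]["date"]), c["id"])
        for c in claims
        if "date" in c.get("temporal", {})
    ]
    return sorted(dated)


claims = [
    {"id": "c-kalshi-065", "temporal": {"type": "historical", "date": 2025}},
    {"id": "c-kalshi-080", "temporal": {"type": "point-in-time", "asOf": "2026-02"}},
    {"id": "c-kalshi-094", "temporal": {"type": "historical", "date": "2025-12"}},
]

print(stale_claims(claims, today=date(2026, 3, 15), max_age_months=3))
print(stale_claims(claims, today=date(2026, 9, 1), max_age_months=3))
print([claim_id for _, claim_id in timeline(claims)])
```

In practice the threshold would probably vary by property: a market probability goes stale faster than a funding total.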

Property Ontology Design Decisions

Decision 1: How many properties to start with?

Resolved. The KB system launched with ~100 properties and continues to grow. The original recommendation of 25-30 was conservative; in practice, the combination of financial metrics, biographical data, organizational structure, and AI-specific properties required a larger vocabulary. This aligns with the document's own research note that "80-150 total properties" would be needed.

Originally recommended initial vocabulary (most now exist in KB):

#	Property	Domain	Range
1	`foundedBy`	organization	person
2	`ceo`	organization	person
3	`employedAt`	person	organization
4	`educatedAt`	person	organization
5	`investedIn`	organization, person	organization
6	`fundedBy`	organization	organization, person
7	`acquiredBy`	organization	organization
8	`partneredWith`	organization	organization
9	`regulatedBy`	organization	organization
10	`suedBy`	organization	organization, person
11	`competesWith`	organization	organization
12	`parentOrg`	organization	organization
13	`subsidiaryOf`	organization	organization
14	`addresses`	approach, policy, project	risk, risk-factor
15	`mitigatedBy`	risk	approach, policy
16	`implementedBy`	approach	organization, project
17	`critiquedBy`	any	person, argument
18	`endorsedBy`	any	person, organization
19	`developedModel`	organization	model
20	`authoredBy`	resource, argument	person
21	`advisorTo`	person	organization
22	`boardMemberOf`	person	organization
23	`previouslyAt`	person	organization
24	`causedBy`	event, risk	any
25	`ledTo`	event	event

Decision 2: Where do properties live?

Resolved. Properties live in packages/kb/data/properties.yaml. The format evolved from what was proposed but follows the same principles:

# packages/kb/data/properties.yaml (actual format)
properties:
  founded-by:
    name: Founded By
    description: "Person(s) who founded this organization"
    dataType: refs
    category: people
    appliesTo: [organization]
    inverseId: founder-of
    inverseName: Founded

  employed-by:
    name: Employed By
    description: "Organization this person currently or historically works for"
    dataType: ref
    category: people
    temporal: true
    appliesTo: [person]
    inverseId: employer-of
    inverseName: Employs

  revenue:
    name: Revenue
    description: "Annualized run-rate revenue (ARR) or trailing twelve-month revenue"
    dataType: number
    unit: USD
    category: financial
    temporal: true
    appliesTo: [organization]
    display:
      divisor: 1e9
      prefix: "$"
      suffix: "B"

Key differences from the original proposal: kebab-case property IDs instead of camelCase, appliesTo instead of domainIncludes/rangeIncludes, inverseId instead of inverse, and display config for rendering. The rangeIncludes concept has no direct equivalent: dataType marks a property as ref or refs, and targets are validated for entity existence but are not constrained by entity type.
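
To illustrate how inverseId declarations can drive auto-computed inverses, here is a minimal sketch. It is not the KB implementation: the property entries mirror the YAML above, while the entity IDs, the triple representation, and `compute_inverses` are hypothetical.

```python
# Property declarations reduced to the fields that matter for inverses.
properties = {
    "founded-by": {"dataType": "refs", "inverseId": "founder-of"},
    "employed-by": {"dataType": "ref", "inverseId": "employer-of"},
    "revenue": {"dataType": "number"},
}

# Hypothetical facts as (entity, property, value) triples; IDs are placeholders.
facts = [
    ("example-org", "founded-by", ["person-a", "person-b"]),
    ("person-c", "employed-by", "example-org"),
    ("example-org", "revenue", 1e9),
]


def compute_inverses(facts, properties):
    """Derive inverse (entity, property, value) triples from inverseId declarations."""
    inverses = []
    for entity, prop_id, value in facts:
        inverse_id = properties.get(prop_id, {}).get("inverseId")
        if inverse_id is None:
            continue  # property declares no inverse (e.g. numeric facts)
        # Treat ref (single target) and refs (list of targets) uniformly.
        targets = value if isinstance(value, list) else [value]
        for target in targets:
            inverses.append((target, inverse_id, entity))
    return inverses


inverse_facts = compute_inverses(facts, properties)
for triple in inverse_facts:
    print(triple)
```

Because the inverse is derived rather than stored, only one direction of each relationship has to be authored and kept current.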

Decision 3: How do claims and facts coexist?

Partially resolved by KB. In the KB system, all structured data lives in a single place: entity files in packages/kb/data/things/. Facts are stored directly on entities with source attribution:

# packages/kb/data/things/kalshi.yaml (actual KB format; the ID and URL are readable placeholders, real fact IDs are generated)
facts:
  - id: f_kalshi_val_2025
    property: valuation
    value: 11e9
    asOf: 2025-12
    source: https://example.com/kalshi-series-e
    sourceQuote: "$11 billion valuation"
    notes: "Series E round led by Paradigm"

This eliminates the dual-store problem. The original proposal's distinction between "fact store" (display, timeseries) and "claim store" (provenance, confidence) collapsed into a single KB fact with both display config (from properties.yaml) and provenance (source fields on each fact). The claim-level metadata (confidence, type taxonomy) proposed in Layer 4 is not yet part of KB facts.
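
A short sketch of how a single fact can serve both display and provenance: the value is rendered using the property's display config while the source fields stay attached to the fact. Two things here are assumptions: that `valuation` carries the same divisor/prefix/suffix config shown for `revenue`, and the rounding behavior, which the source does not specify. `format_fact` is a hypothetical helper.

```python
def format_fact(value, display):
    """Apply divisor/prefix/suffix display config to a numeric fact value."""
    scaled = value / display.get("divisor", 1)
    # ':g' drops trailing zeros (11.0 -> "11"); the real rounding rule is an assumption.
    return f"{display.get('prefix', '')}{scaled:g}{display.get('suffix', '')}"


# Display config as shown for `revenue`; assumed here to apply to `valuation` too.
display = {"divisor": 1e9, "prefix": "$", "suffix": "B"}

fact = {
    "property": "valuation",
    "value": 11e9,
    "asOf": "2025-12",
    "sourceQuote": "$11 billion valuation",
}

rendered = format_fact(fact["value"], display)
print(f"{rendered} (as of {fact['asOf']}), quoted as: {fact['sourceQuote']}")
```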

Decision 4: Advisory constraints or hard enforcement?

Resolved: Advisory with validation. The KB system follows the Schema.org-style approach recommended here. appliesTo constraints in properties.yaml are checked by KB validation rules: using a property on an entity type it doesn't apply to generates a validation warning, but the system does not hard-block fact creation. The 23 validation rules cover integrity (valid property IDs, valid refs), consistency (temporal flags match property declarations), and quality (source coverage) without being overly restrictive.
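
The advisory behavior can be sketched as a check that returns warnings and never rejects a fact. This is an illustration under assumptions, not one of the 23 KB rules: the function name, warning wording, and fact IDs are hypothetical.

```python
def check_applies_to(entity_type, facts, properties):
    """Return advisory warnings for facts whose property does not apply to the entity type.

    Nothing is rejected or removed; callers decide what to do with the warnings.
    """
    warnings = []
    for fact in facts:
        prop = properties.get(fact["property"])
        if prop is None:
            continue  # unknown property IDs are a separate integrity check
        allowed = prop.get("appliesTo")
        if allowed and entity_type not in allowed:
            warnings.append(
                f"{fact['id']}: '{fact['property']}' applies to {allowed}, not '{entity_type}'"
            )
    return warnings


properties = {
    "founded-by": {"appliesTo": ["organization"]},
    "employed-by": {"appliesTo": ["person"]},
}
facts = [
    {"id": "f_example_1", "property": "founded-by"},
    {"id": "f_example_2", "property": "employed-by"},  # mismatched on purpose
]

warnings = check_applies_to("organization", facts, properties)
for message in warnings:
    print("warning:", message)
```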

Implementation Path

Phase 1: Property vocabulary (data layer only) — COMPLETE

~~Create data/properties.yaml with the initial 25 properties~~ -- Done: packages/kb/data/properties.yaml with 100+ properties
~~Add a Property Zod schema to data/schema.ts~~ -- Done: KB validation in packages/kb/
~~Validate property references in build-data~~ -- Done: 23 validation rules

Phase 2: Entity data population — COMPLETE

~~Pilot with a few entities~~ -- Done: 360+ entities in packages/kb/data/things/
~~Build dashboards~~ -- Done: citation dashboards (E917, E918, E919); legacy Fact dashboard (E898) removed
~~Define schemas per entity type~~ -- Done: 19 schemas in packages/kb/data/schemas/

Phase 3: Extraction and display tooling — COMPLETE

~~Integrate KB facts into wiki pages~~ -- Done: <FBF>, <FBFactValue>, and <Calc> MDX components
~~Build fact extraction pipeline~~ -- Done: crux footnotes commands for migrating claims to KB facts
~~Validate property constraints and source coverage~~ -- Done: KB validation rules

Phase 4: Claim-level metadata — NOT STARTED

This is the remaining gap. The KB system captures what is true with sources, but does not yet support:

Claim-type taxonomy (factual, consensus, analytical, speculative)
Confidence levels (verified, partial, unverified)
Explicit reasoning chains for analytical claims (supportingClaims)
Verification status tracking
Claim clustering for topical grouping

Open Questions

Some original questions have been resolved by KB; others remain open.

~~Claim ID format.~~ Resolved: KB uses f_ prefixed IDs (e.g., f_qR5tY9wE1a) — 10-char alphanumeric, generated, collision-resistant. The c-kalshi-001 format from this proposal was never adopted.
~~File granularity.~~ Resolved: One file per entity in packages/kb/data/things/. Large entities have many facts in one file, which works well in practice.
Who maintains facts? Partially resolved. The crux footnotes pipeline can migrate prose claims to KB facts. The content improve pipeline (crux content improve) writes prose but does not yet update KB facts directly. KB facts and prose coexist, with <FBF> components bridging them in MDX.
How do facts and prose stay in sync? The <FBF> and <FBFactValue> MDX components render KB fact values inline in prose, so updating a KB fact automatically updates the displayed value. For prose that doesn't use these components, sync remains manual.
Property hierarchy. Not yet implemented. KB properties have category groupings (financial, people, biographical, organization, product, etc.) but no formal sub-property relationships. With 100+ properties, some hierarchy may become useful for dashboard navigation.
Claim-level metadata. (New) Should the KB system add confidence/verification fields to facts? The current source/notes fields are free-text. Adding structured confidence would enable the verification workflows described in Layer 4, but adds complexity to every fact entry.
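
For the generated ID format described under Claim ID format above (an f_ prefix followed by 10 alphanumeric characters), one plausible generator looks like the sketch below. The actual KB generator is not shown in this document, so the alphabet, the use of Python's `secrets` module, and the retry-on-collision loop are all assumptions.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 symbols


def new_fact_id(existing_ids=frozenset(), length=10):
    """Generate an f_-prefixed ID, retrying in the unlikely event of a collision."""
    while True:
        candidate = "f_" + "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in existing_ids:
            return candidate


fact_id = new_fact_id()
print(fact_id)
```

With 62 symbols and 10 positions there are 62^10 (roughly 8.4e17) possible IDs, so collisions across a few thousand facts are very unlikely even before the explicit check.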

Current Status Summary (March 2026)

Proposal area	Status	Where it lives
Property vocabulary	Implemented (100+ properties)	`packages/kb/data/properties.yaml`
Entity type schemas	Implemented (19 types)	`packages/kb/data/schemas/*.yaml`
Entity data files	Implemented (360+ entities)	`packages/kb/data/things/*.yaml`
Value types (number, text, date, ref, etc.)	Implemented (9 types)	KB fact schema
Temporal qualifiers	Implemented (`asOf`, `validEnd`)	KB fact fields
Source attribution	Implemented (source, sourceResource, sourceQuote, notes)	KB fact fields
Inverse relationships	Implemented (auto-computed from `inverseId`)	`properties.yaml` declarations
Record collections (funding-rounds, etc.)	Implemented	Entity type schemas
Display configuration	Implemented (divisor, prefix, suffix)	`properties.yaml` display config
Validation	Implemented (23 rules)	KB validation system
MDX integration	Implemented (`<FBF>`, `<FBFactValue>`, `<Calc>`)	MDX components
Fact extraction pipeline	Implemented	`crux footnotes` commands
Claim-type taxonomy	Not implemented	—
Confidence / verification levels	Not implemented	—
Reasoning chains (`supportingClaims`)	Not implemented	—
Claim clustering	Not implemented	—

The KB system has successfully realized the core data model proposed in this document. The remaining aspirational layer (claim-level metadata for confidence, verification, and reasoning) represents an evolution beyond what most knowledge graphs implement, and may be pursued if the use cases for it become clearer.