Knowledge Graph Ontology: Design & Implementation Status
Motivation
This document was originally written (February 2025) when the wiki had two disconnected data layers: a small set of numeric facts in data/facts/*.yaml and thousands of unstructured prose claims across ~700 MDX pages. It proposed a property ontology to bridge those layers.
Since then, the KB system (packages/kb/) has been built and implements most of what was proposed here. The KB system provides:
- 100+ typed properties defined in packages/kb/data/properties.yaml with display config, categories, temporal flags, inverseId declarations, and appliesTo constraints
- 19 entity type schemas in packages/kb/data/schemas/ defining which properties are required/recommended per type
- 360+ entity files in packages/kb/data/things/ with structured facts (number, text, date, boolean, ref, refs, range, min, json value types)
- Temporal support via asOf and validEnd fields on facts
- Source attribution via source URL, sourceResource, sourceQuote, and notes fields
- Inverse relationships auto-computed from inverseId declarations on properties
- Record collections for structured sub-data (funding-rounds, key-people, products, model-releases, board-members, etc.)
- 23 validation rules covering integrity, consistency, and quality
- Custom YAML tags (!ref for entity references, !date for dates)
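As a rough illustration of what temporal support enables, the sketch below looks up the fact in effect at a given month. It is not the KB implementation: the dict layout, the `fact_at` helper, and the interval semantics (a fact holds from asOf inclusive until validEnd exclusive) are assumptions made for this example, as is the validEnd value on the first fact.

```python
from typing import Optional

# Illustrative facts. The valuations and dates appear elsewhere in this
# document; the validEnd on the first fact is an assumption for the sketch.
facts = [
    {"property": "valuation", "value": 5_000_000_000,
     "asOf": "2025-10", "validEnd": "2025-12"},
    {"property": "valuation", "value": 11_000_000_000,
     "asOf": "2025-12", "validEnd": None},
]


def fact_at(facts: list[dict], prop: str, when: str) -> Optional[dict]:
    """Return the fact for `prop` in effect at `when` (ISO 'YYYY-MM' string).

    Assumed semantics: a fact holds from asOf (inclusive) until validEnd
    (exclusive); a missing validEnd means the fact is still current.
    """
    candidates = [
        f for f in facts
        if f["property"] == prop
        and f["asOf"] <= when
        and (f.get("validEnd") is None or when < f["validEnd"])
    ]
    # ISO date strings of equal precision sort chronologically.
    return max(candidates, key=lambda f: f["asOf"], default=None)
```

Comparing ISO strings keeps the sketch dependency-free, but it only works when every date uses the same precision.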
The Fact System Strategy refined how numeric facts work. The Claim-First Architecture proposal described making all claims first-class data objects. This document tackled the missing piece between those two: the property ontology — what types of claims exist, what constraints they have, and how they connect to entities. The KB system is the realized implementation of that ontology.
The rest of this document preserves the original design thinking and worked examples, annotated with what KB now implements versus what remains aspirational.
What the KB System Implements
The KB system (packages/kb/) now covers most knowledge graph concepts that were originally gaps:
| Knowledge graph concept | KB implementation | Status |
|---|---|---|
| Entity classes (types) | 19 type schemas in packages/kb/data/schemas/ (organization, person, concept, ai-model, etc.) | Implemented |
| Typed properties | 95 properties in packages/kb/data/properties.yaml with dataType, category, appliesTo | Implemented |
| Domain constraints | appliesTo: [organization] on each property | Implemented |
| Temporal qualifiers | asOf and validEnd fields on individual facts | Implemented |
| Source attribution | source URL, sourceResource, sourceQuote, notes per fact | Implemented |
| Entity relationships | ref/refs fact types with inverseId auto-computation | Implemented |
| Non-numeric facts | text, date, boolean, ref, refs, range, min, json value types | Implemented |
| Record collections | Structured sub-data: funding-rounds, key-people, products, model-releases, board-members | Implemented |
| Confidence / verification | Not in KB; source/notes fields provide partial coverage | Not yet implemented |
| Claim-type taxonomy | Not in KB; no factual/consensus/analytical/speculative distinction | Not yet implemented |
The remaining gap is the claim-level metadata proposed in Layer 4 below: confidence levels, verification status, claim-type taxonomy, and explicit reasoning chains for analytical claims. The KB system captures what is true with sources, but does not yet capture how confident we are or what type of assertion this is.
Design Principles
1. Schema.org over Wikidata
Wikidata's property constraints are advisory warnings that flag violations. Schema.org's approach is softer: domainIncludes means "this property is typically used with these types," not "using it elsewhere is wrong."
For our system — where LLM agents do much of the authoring — Schema.org's descriptive approach is better. Properties should suggest what's expected, not hard-block. This matches how applicableTo on measures already works.
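The descriptive approach can be sketched as a check that returns warnings instead of raising. This is a minimal illustration, not the KB validator: `PropertyDef` and `domain_warnings` are hypothetical names, and only the appliesTo idea comes from this document.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyDef:
    """Hypothetical, simplified stand-in for one property definition."""
    id: str
    # Entity types the property is typically used with; empty means no hint.
    applies_to: list[str] = field(default_factory=list)


def domain_warnings(prop: PropertyDef, entity_type: str) -> list[str]:
    """Return advisory warnings; never raise, so authoring is not blocked."""
    if prop.applies_to and entity_type not in prop.applies_to:
        return [
            f"property '{prop.id}' is typically used with "
            f"{prop.applies_to}, not '{entity_type}'"
        ]
    return []


founded_date = PropertyDef("founded-date", applies_to=["organization", "funder"])
ok = domain_warnings(founded_date, "organization")   # no warnings
flagged = domain_warnings(founded_date, "person")    # one advisory warning
```

Because the function only reports, an LLM agent can still write the fact and a reviewer can decide later whether the warning matters.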
2. Small vocabulary, enforced at write time
Wikidata has ~13,200 properties. Schema.org has ~1,500. SNOMED (biomedical) has 65 relationship types for millions of concepts.
For ~700 entities across 24 types, the research suggests 80-150 total properties. But most value comes from a small core — Wikidata's top 100 properties cover the vast majority of statements. We should start with 30-50 properties and grow based on actual data needs.
The critical constraint from Wikidata's experience: properties are hard to remove once data uses them. Be thoughtful about additions, but don't let that paralyze progress.
3. Qualifiers over property explosion
Without qualifiers, you get property proliferation:
# BAD: separate properties for temporal variants
kalshi-valuation-series-d: $5B
kalshi-valuation-series-e: $11B
# GOOD: one property with qualifiers
- property: valuation
  value: 11000000000
  qualifiers:
    pointInTime: 2025-12
    fundingRound: Series E
  source: resource-abc
The asOf field on existing facts is already a qualifier. Extending this pattern to all claims is natural.
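To show why a single qualified property is easier to work with than one property per variant, here is a small sketch that filters and orders valuation facts by their qualifiers. The in-memory layout and the `select` helper are assumptions for illustration; the Series D and Series E figures are the ones used in this document.

```python
facts = [
    {"property": "valuation", "value": 5_000_000_000,
     "qualifiers": {"pointInTime": "2025-10", "fundingRound": "Series D"}},
    {"property": "valuation", "value": 11_000_000_000,
     "qualifiers": {"pointInTime": "2025-12", "fundingRound": "Series E"}},
]


def select(facts, prop, **qualifiers):
    """Return facts for `prop` whose qualifiers match every given key/value."""
    return [
        f for f in facts
        if f["property"] == prop
        and all(f["qualifiers"].get(k) == v for k, v in qualifiers.items())
    ]


# One property answers both questions; no per-round property is needed.
series_d = select(facts, "valuation", fundingRound="Series D")
history = sorted(select(facts, "valuation"),
                 key=lambda f: f["qualifiers"]["pointInTime"])
```

With separate properties such as kalshi-valuation-series-d, the same history query would need to know every property name in advance.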
4. Resources as the authoring unit
A single TechCrunch article produces 8-12 claims. A single annual report might produce 50+. The resource → batch-of-claims relationship is the natural unit for authoring and review. Claims should track which resource they were extracted from, enabling workflows like "show me everything we learned from this source."
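The "everything we learned from this source" workflow amounts to an index from resource to claims. A minimal sketch follows, assuming claims are dicts with a `sources` list of `resourceId` entries as in the hypothetical claim schema later in this document; `claims_by_resource` is an illustrative helper, not an existing function.

```python
from collections import defaultdict

# A few claim ids and resource ids taken from the worked example below.
claims = [
    {"id": "c-kalshi-001", "sources": [{"resourceId": "contrary-research-kalshi"}]},
    {"id": "c-kalshi-005", "sources": [{"resourceId": "contrary-research-kalshi"}]},
    {"id": "c-kalshi-010", "sources": [{"resourceId": "sigma-world-timeline"}]},
    {"id": "c-kalshi-082", "sources": []},  # analytical claim with no direct source
]


def claims_by_resource(claims):
    """Map each resourceId to the ids of the claims extracted from it."""
    index = defaultdict(list)
    for claim in claims:
        for source in claim["sources"]:
            index[source["resourceId"]].append(claim["id"])
    return dict(index)


index = claims_by_resource(claims)
```

A review tool could iterate over `index` to present one batch per resource, and unsourced claims simply do not appear in any batch.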
Proposed Property Ontology
Layer 1: Universal Properties (apply to all entity types)
Status: Largely implemented in KB. Each KB entity file (packages/kb/data/things/*.yaml) has a thing: header with id, name, type, numericId, and aliases. Many of the properties below now exist as KB properties:
| Property | Value type | Description | KB status |
|---|---|---|---|
| name | string | Primary name | Implemented (thing.name) |
| description | string | Short description | Implemented (MDX frontmatter description) |
| founded-date | date | When created/founded | Implemented (KB property, appliesTo: [organization, funder]) |
| end-date | date | When dissolved/died | Not yet a KB property |
| status | enum | active, inactive, dissolved, deceased | Not yet a KB property |
| official-url | url | Primary website | Implemented (KB property website) |
| wikipedia-url | url | Wikipedia page | Not yet a KB property |
| wikidata-id | string | Wikidata Q-identifier | Not yet a KB property |
| parent-org | ref | Parent org, broader concept | Not yet a KB property |
Layer 2: Type-Specific Properties
Status: Implemented in KB. The 19 schema files in packages/kb/data/schemas/ define required and recommended properties per entity type. For example, organization.yaml recommends founded-date, headquarters, revenue, valuation, headcount, legal-structure, total-funding and defines record collections for funding-rounds, key-people, products, etc.
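A schema of required and recommended properties makes it easy to report what an entity is still missing. The sketch below assumes a simple dict shape for the schema and a hypothetical `coverage_report` helper. The recommended list is the one given above for organization.yaml, and the required list is left empty because this document does not state it.

```python
# Recommended properties as listed in this document for organization.yaml.
# "required" is empty here only because the document does not list any.
ORGANIZATION_SCHEMA = {
    "required": [],
    "recommended": [
        "founded-date", "headquarters", "revenue", "valuation",
        "headcount", "legal-structure", "total-funding",
    ],
}


def coverage_report(schema: dict, present: set[str]) -> dict[str, list[str]]:
    """List the schema properties for which an entity has no fact."""
    return {
        level: [p for p in schema.get(level, []) if p not in present]
        for level in ("required", "recommended")
    }


# Hypothetical entity that has facts for three properties.
report = coverage_report(
    ORGANIZATION_SCHEMA, {"founded-date", "headquarters", "valuation"}
)
```

Missing required properties would typically be treated as errors and missing recommended ones as suggestions, though how the KB validation rules classify them is not specified here.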
Organization properties (KB implementation):
| Property | KB dataType | KB status | Notes |
|---|---|---|---|
| headquarters | text | Implemented | Location string |
| founded-by | refs | Implemented | With inverseId: founder-of |
| legal-structure | text | Implemented | nonprofit, llc, corp, etc. |
| headcount | number (temporal) | Implemented | With display config |
| user-count | number (temporal) | Implemented | |
| revenue | number (temporal) | Implemented | Display: divisor 1e9, prefix $, suffix B |
| valuation | number (temporal) | Implemented | Display: divisor 1e9, prefix $, suffix B |
| total-funding | number (temporal) | Implemented | |
| market-share | number (temporal) | Implemented | Display: suffix % |
| gross-margin | number (temporal) | Implemented | Display: suffix % |
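The display config in the Notes column (divisor, prefix, suffix) can be read as a simple formatting rule. The following sketch shows one plausible interpretation; `format_fact` and the dict keys are assumptions, not the KB's actual field names or renderer.

```python
def format_fact(value: float, divisor: float = 1.0,
                prefix: str = "", suffix: str = "") -> str:
    """Scale a raw stored value and wrap it with a prefix and suffix."""
    scaled = value / divisor
    # ':g' drops trailing zeros (11.0 -> '11') and keeps 6 significant digits.
    return f"{prefix}{scaled:g}{suffix}"


# Assumed shape of the display config for valuation and revenue.
VALUATION_DISPLAY = {"divisor": 1e9, "prefix": "$", "suffix": "B"}

print(format_fact(11_000_000_000, **VALUATION_DISPLAY))  # $11B
print(format_fact(1_500_000_000, **VALUATION_DISPLAY))   # $1.5B
```

Storing the raw number and formatting at render time keeps facts comparable and sortable regardless of how they are displayed.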
Person properties (KB implementation):
| Property | KB dataType | KB status | Notes |
|---|---|---|---|
| born-year | number | Implemented | |
| education | text | Implemented | |
| employed-by | ref (temporal) | Implemented | With inverseId: employer-of |
| role | text (temporal) | Implemented | Job title at primary employer |
| notable-for | text | Implemented | One-line summary |
| net-worth | number (temporal) | Implemented | |
| founder-of | refs (computed) | Implemented | Auto-computed from founded-by inverse |
Risk/concept properties (from original proposal):
| Property | KB dataType | KB status | Notes |
|---|---|---|---|
| riskCategory | enum | Not yet a KB property | |
| affectedBy | refs | Not yet a KB property | |
| mitigatedBy | refs | Not yet a KB property | |
| estimatedProbability | number | Not yet a KB property | |
Risk and concept schemas exist (packages/kb/data/schemas/risk.yaml, concept.yaml) but have fewer defined properties than organization and person schemas. These remain an area for growth.
Layer 3: Relationship Properties (cross-entity claims)
Status: Partially implemented in KB. The KB system uses ref and refs data types for entity relationships, with inverseId declarations enabling automatic inverse computation. Many of the proposed relationship properties now exist:
| Property | KB status | KB inverseId | Notes |
|---|---|---|---|
| employed-by | Implemented | employer-of | ref, temporal |
| founded-by | Implemented | founder-of | refs |
| parent-org | Not yet | — | |
| subsidiary-of | Not yet | — | |
| invested-in | Not yet | — | Partially covered by funding-rounds record collection |
| partnered-with | Not yet | — | |
| regulated-by | Not yet | — | |
| sued-by | Not yet | — | |
| competes-with | Not yet | — | |
| addresses | Not yet | — | For approach/policy to risk |
| implemented-by | Not yet | — | For approach to org/project |
The KB's record collections (e.g., funding-rounds with lead_investor as a ref field) handle some relationship patterns that don't map cleanly to simple property-value facts. For example, a funding round involves an amount, date, valuation, and lead investor — this is better represented as a structured record than as separate investedIn facts.
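Inverse computation from inverseId declarations can be sketched in a few lines. This is an illustration under assumed data shapes, not the KB code: `PROPERTIES`, the fact dicts, and `compute_inverses` are hypothetical, and the employed-by fact is included only to show the single-ref case.

```python
from collections import defaultdict

# Property definitions with the inverseId pairs listed in the table above.
PROPERTIES = {
    "founded-by": {"dataType": "refs", "inverseId": "founder-of"},
    "employed-by": {"dataType": "ref", "inverseId": "employer-of"},
    "headquarters": {"dataType": "text"},  # no inverse declared
}

facts = [
    {"subject": "kalshi", "property": "founded-by",
     "value": ["tarek-mansour", "luana-lopes-lara"]},
    # Illustrative single-ref fact (Mansour is listed as Kalshi's CEO).
    {"subject": "tarek-mansour", "property": "employed-by", "value": "kalshi"},
    {"subject": "kalshi", "property": "headquarters", "value": "New York, NY"},
]


def compute_inverses(facts, properties):
    """Derive {target: {inverseId: [subjects]}} from ref and refs facts."""
    inverses = defaultdict(lambda: defaultdict(list))
    for fact in facts:
        inverse_id = properties.get(fact["property"], {}).get("inverseId")
        if inverse_id is None:
            continue  # property declares no inverse
        value = fact["value"]
        targets = value if isinstance(value, list) else [value]
        for target in targets:
            inverses[target][inverse_id].append(fact["subject"])
    return {target: dict(props) for target, props in inverses.items()}


inverses = compute_inverses(facts, PROPERTIES)
```

Deriving inverses at load time means authors record each relationship once, on whichever side is most natural, and the other side cannot drift out of sync.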
Layer 4: Claim Types and Confidence
Status: Not yet implemented in KB. This layer remains aspirational. The KB system's source, sourceQuote, and notes fields provide basic provenance, but the structured confidence taxonomy below is not part of the current KB schema. This is the most significant gap between the proposal and the implementation.
Following the taxonomy from the Claim-First Architecture proposal:
| Claim type | Verification strategy | Confidence levels | Example |
|---|---|---|---|
| factual | Check against source | verified, partial, unverified | "Kalshi was founded in 2018" |
| numeric | Check value + source; link to fact store | verified, estimated | "$11B valuation in Dec 2025" |
| consensus | Multiple independent sources | strong, moderate, weak | "Leading regulated US prediction market" |
| analytical | Check inference from supporting claims | supported, speculative | "Fee structure incentivizes liquidity" |
| speculative | Cannot verify — flag uncertainty | plausible, uncertain, contested | "Regulatory moat may narrow" |
| relational | Cross-entity verification | verified, partial | "Growth rate exceeded Polymarket's" |
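If this taxonomy were adopted, the pairing of claim type and confidence level could be checked mechanically. The sketch below encodes the table above as a lookup; `ALLOWED_CONFIDENCE` and `check_claim` are hypothetical names, and nothing like this exists in the KB today.

```python
# Allowed confidence levels per claim type, copied from the table above.
ALLOWED_CONFIDENCE = {
    "factual": {"verified", "partial", "unverified"},
    "numeric": {"verified", "estimated"},
    "consensus": {"strong", "moderate", "weak"},
    "analytical": {"supported", "speculative"},
    "speculative": {"plausible", "uncertain", "contested"},
    "relational": {"verified", "partial"},
}


def check_claim(claim: dict) -> list[str]:
    """Return a list of problems with a claim's type/confidence pairing."""
    claim_type = claim.get("type")
    if claim_type not in ALLOWED_CONFIDENCE:
        return [f"{claim.get('id')}: unknown claim type {claim_type!r}"]
    confidence = claim.get("confidence")
    if confidence not in ALLOWED_CONFIDENCE[claim_type]:
        allowed = sorted(ALLOWED_CONFIDENCE[claim_type])
        return [f"{claim.get('id')}: confidence {confidence!r} is not one of "
                f"{allowed} for type {claim_type!r}"]
    return []


# Hypothetical claims used only to exercise the check.
good = check_claim({"id": "ex-1", "type": "numeric", "confidence": "estimated"})
bad = check_claim({"id": "ex-2", "type": "analytical", "confidence": "verified"})
```

Returning messages rather than raising fits the advisory style preferred in the design principles.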
Worked Example: Kalshi Page Decomposed
Note (March 2026): This worked example was created before the KB system existed and uses a hypothetical claim schema (data/claims/kalshi.yaml) that was never implemented. Many of the properties used below (e.g., foundedBy, fundedBy, headquarters) now exist as KB properties in packages/kb/data/properties.yaml, and entity data such as funding rounds is stored as structured items in KB entity files (packages/kb/data/things/kalshi.yaml). The claim-level metadata (confidence, type taxonomy, reasoning chains) shown here remains aspirational: the KB system does not yet track it. The example is preserved because it demonstrates the analytical decomposition process and the value of structured claims.
The Kalshi wiki page contains ~332 lines of prose with 87 footnoted citations. Below is how its content would decompose into structured claims using the property ontology above.
Entity Profile (Structured Properties)
These are property-value pairs on the Kalshi entity itself — the equivalent of a Wikidata item's core statements:
# Hypothetical format — not implemented. In the KB system, entity data lives
# in packages/kb/data/things/kalshi.yaml using the KB fact schema.
entity: kalshi
entityType: organization
subcategory: architecture
properties:
  name: "Kalshi"
  previousName: "Kownig"
  foundedDate: 2018
  headquarters: "New York, NY"
  officialUrl: "https://kalshi.com"
  wikipediaUrl: "https://en.wikipedia.org/wiki/Kalshi"
  wikidataId: "Q114586938"
  industry: [prediction-markets, financial-services, sports-betting]
  legalForm: corporation
  revenueModel: "Transaction fees (maker-taker: 0.07%-7% on takers)"
  regulatoryStatus:
    value: "CFTC-licensed Designated Contract Market (DCM)"
    qualifiers:
      grantedDate: 2020-11
      jurisdiction: federal
    source: sigma-world-timeline
  marketCategories:
    - politics
    - sports
    - economics
    - climate
    - finance-crypto
    - culture
    - technology-science
  founder:
    - entity: tarek-mansour
      role: CEO
    - entity: luana-lopes-lara
      role: COO
# These link to numeric facts (existing system)
factRefs:
  valuation: { factId: tbd, asOf: 2025-12, value: 11000000000 }
  totalFunding: { factId: tbd, asOf: 2025-12, value: 1500000000 }
  tradingVolume: { factId: tbd, asOf: 2025-12, value: "40-50B annualized" }
Factual Claims (52 claims)
These are verifiable assertions extracted from the prose. Each links to its source and can be independently verified, updated, or superseded.
# Hypothetical claim schema — not implemented as shown. The KB system stores
# facts per-entity in packages/kb/data/things/kalshi.yaml without claim-level
# metadata like confidence, type taxonomy, or reasoning chains.
entity: kalshi
claims:
  # === FOUNDING & HISTORY (8 claims) ===
  - id: c-kalshi-001
    property: foundedBy
    text: "Founded in 2018 by MIT graduates Tarek Mansour and Luana Lopes Lara"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        quote: "Founded in 2018 by Tarek Mansour and Luana Lopes Lara"
        support: full
    temporal: { type: historical, date: 2018 }
    entityRefs:
      - { id: kalshi, role: subject }
      - { id: tarek-mansour, role: founder }
      - { id: luana-lopes-lara, role: founder }
    cluster: kalshi-founding
  - id: c-kalshi-002
    property: educatedAt
    text: "Founders met at MIT studying computer science, mathematics, and electrical engineering"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical }
    entityRefs:
      - { id: tarek-mansour, role: subject }
      - { id: luana-lopes-lara, role: subject }
    cluster: kalshi-founding
  - id: c-kalshi-003
    property: employedAt
    text: "Founders completed internships at Goldman Sachs, Palantir, Five Rings Capital, and Citadel"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical }
    entityRefs:
      - { id: tarek-mansour, role: subject }
      - { id: luana-lopes-lara, role: subject }
    cluster: kalshi-founding
  - id: c-kalshi-004
    property: employedAt
    text: "Mansour worked as a quantitative trader at Citadel (May 2018 to May 2019)"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical, startDate: 2018-05, endDate: 2019-05 }
    entityRefs:
      - { id: tarek-mansour, role: subject }
    cluster: kalshi-founding
  - id: c-kalshi-005
    text: "Company was initially named 'Kownig'"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical, date: 2018 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-founding
  - id: c-kalshi-006
    text: "Joined Y Combinator Winter 2019 batch"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical, date: 2019 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-founding
  - id: c-kalshi-007
    text: "Founders motivated by gaps in hedging event outcomes, particularly after observing Brexit market shocks"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-founding
  - id: c-kalshi-008
    text: "Early years focused on regulatory compliance — built exchange, broker, and surveillance systems before acquiring users"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: historical }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-founding
  # === REGULATORY MILESTONES (7 claims) ===
  - id: c-kalshi-010
    property: regulatedBy
    text: "CFTC approved Kalshi as first designated contract market (DCM) for event contracts"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2020-11 }
    entityRefs:
      - { id: kalshi, role: subject }
    cluster: kalshi-regulation
  - id: c-kalshi-011
    text: "Official platform launch enabling trades on economic events"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2021-07 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-regulation
  - id: c-kalshi-012
    text: "CFTC began examining political event contracts proposal"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2022-08 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-regulation
  - id: c-kalshi-013
    text: "CFTC rejected political contracts, citing gambling risks"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2023-09 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-regulation
  - id: c-kalshi-014
    property: suedBy
    text: "Kalshi sued CFTC for regulatory overreach"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2023-11 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-regulation
  - id: c-kalshi-015
    text: "Federal court ruled unanimously in Kalshi's favor, allowing election contracts to resume"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2024-09 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-regulation
  - id: c-kalshi-016
    text: "Robinhood launched prediction hub on Kalshi infrastructure"
    type: factual
    confidence: verified
    sources:
      - resourceId: sigma-world-timeline
        support: full
    temporal: { type: historical, date: 2024-11 }
    entityRefs:
      - { id: kalshi, role: subject }
    cluster: kalshi-partnerships
  # === FUNDING ROUNDS (7 claims) ===
  - id: c-kalshi-020
    property: fundedBy
    text: "Raised $6.1 million in seed round"
    type: numeric
    confidence: verified
    value: 6100000
    measure: funding-round
    sources:
      - resourceId: nasdaq-private-market-kalshi
        support: full
    temporal: { type: historical, date: 2020-03 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding
  - id: c-kalshi-021
    property: fundedBy
    text: "Raised $30 million Series A at $120 million valuation"
    type: numeric
    confidence: verified
    value: 30000000
    measure: funding-round
    qualifiers:
      valuation: 120000000
      round: "Series A"
    sources:
      - resourceId: cbinsights-kalshi
        support: full
    temporal: { type: historical, date: 2020-12 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding
  - id: c-kalshi-022
    property: fundedBy
    text: "Raised $60 million Series B"
    type: numeric
    confidence: verified
    value: 60000000
    measure: funding-round
    qualifiers:
      round: "Series B"
    sources:
      - resourceId: nasdaq-private-market-kalshi
        support: full
    temporal: { type: historical, date: 2025-06 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding
  - id: c-kalshi-023
    property: fundedBy
    text: "Raised $125 million Series C"
    type: numeric
    confidence: verified
    value: 125000000
    measure: funding-round
    qualifiers:
      round: "Series C"
    sources:
      - resourceId: nasdaq-private-market-kalshi
        support: full
    temporal: { type: historical, date: 2025-06 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding
  - id: c-kalshi-024
    property: fundedBy
    text: "Raised $300 million Series D at $5 billion valuation, led by Sequoia Capital and Andreessen Horowitz"
    type: numeric
    confidence: verified
    value: 300000000
    measure: funding-round
    qualifiers:
      valuation: 5000000000
      round: "Series D"
      leadInvestor: [sequoia-capital, andreessen-horowitz]
    sources:
      - resourceId: equityzen-kalshi
        support: full
    temporal: { type: historical, date: 2025-10 }
    entityRefs:
      - { id: kalshi, role: subject }
      - { id: sequoia-capital, role: investor }
      - { id: andreessen-horowitz, role: investor }
    cluster: kalshi-funding
  - id: c-kalshi-025
    property: fundedBy
    text: "Raised $1 billion Series E at $11 billion valuation"
    type: numeric
    confidence: verified
    value: 1000000000
    measure: funding-round
    qualifiers:
      valuation: 11000000000
      round: "Series E"
      leadInvestor: [paradigm]
      participants: [sequoia-capital, andreessen-horowitz, meritech-capital,
                     ivp, ark-invest, anthos-capital, capitalg, y-combinator]
    sources:
      - resourceId: kalshi-series-e-announcement
        quote: "$11 billion valuation"
        support: full
    temporal: { type: historical, date: 2025-12 }
    entityRefs:
      - { id: kalshi, role: subject }
    cluster: kalshi-funding
  - id: c-kalshi-026
    text: "Total funding raised exceeds $1.5 billion across nine rounds"
    type: numeric
    confidence: verified
    value: 1500000000
    measure: total-funding
    sources:
      - resourceId: cbinsights-kalshi
        support: full
    temporal: { type: point-in-time, asOf: 2025-12 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-funding
  # === PLATFORM OPERATIONS (6 claims) ===
  - id: c-kalshi-030
    text: "Operates as peer-to-peer marketplace using central limit order book (CLOB)"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-platform
  - id: c-kalshi-031
    text: "Binary yes/no contracts on verifiable real-world events; price reflects market probability"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-how-markets-work
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-platform
  - id: c-kalshi-032
    text: "Maker-taker fee structure: makers pay no fees, takers pay 0.07%-7%"
    type: factual
    confidence: verified
    sources:
      - resourceId: sacra-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-platform
  - id: c-kalshi-033
    text: "Real-time sentiment data valued at over $50,000 annually by institutional clients"
    type: numeric
    confidence: partial
    sources:
      - resourceId: sacra-kalshi
        support: partial  # Sacra estimate, not Kalshi-confirmed
    temporal: { type: point-in-time, asOf: 2025 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-platform
  - id: c-kalshi-034
    text: "Trading volumes exceeding $1 billion weekly by late 2025"
    type: numeric
    confidence: verified
    sources:
      - resourceId: kalshi-series-e-announcement
        support: full
    temporal: { type: point-in-time, asOf: 2025-12 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-scale
  - id: c-kalshi-035
    text: "Sports betting accounts for approximately 90% of trading volume"
    type: numeric
    confidence: verified
    sources:
      - resourceId: natlawreview-kalshi-records
        support: full
    temporal: { type: point-in-time, asOf: 2025-12 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-scale
  # === PARTNERSHIPS (10 claims) ===
  - id: c-kalshi-040
    property: partneredWith
    text: "Multiyear partnership with NHL including official data, logos, and broadcast signage"
    type: factual
    confidence: verified
    qualifiers:
      partnershipType: sports-data
      startDate: 2025
    sources:
      - resourceId: nhl-kalshi-partnership
        support: full
    temporal: { type: ongoing }
    entityRefs:
      - { id: kalshi, role: subject }
    cluster: kalshi-partnerships
  - id: c-kalshi-041
    property: partneredWith
    text: "First brand partnership between a North American sports team and a prediction market (Chicago Blackhawks)"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-blackhawks
        support: full
    temporal: { type: historical, date: 2025 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships
  - id: c-kalshi-042
    property: partneredWith
    text: "Three-year strategic partnership with STATSCORE for premium sports data"
    type: factual
    confidence: verified
    qualifiers:
      partnershipType: data-provider
      duration: "3 years"
    sources:
      - resourceId: statscore-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships
  - id: c-kalshi-043
    property: partneredWith
    text: "StockX partnership for sneaker/apparel/collectible event contracts"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-stockx
        support: full
    temporal: { type: historical, date: 2025-11 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships
  - id: c-kalshi-044
    property: partneredWith
    text: "Barchart partnership providing prediction data to 32 million investors"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-barchart
        support: full
    temporal: { type: historical, date: 2025-11 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships
  - id: c-kalshi-045
    property: partneredWith
    text: "Robinhood prediction hub built on Kalshi infrastructure; accounts for over half of Kalshi's betting volume"
    type: factual
    confidence: partial
    sources:
      - resourceId: cryptopolitan-kalshi-2026
        support: partial  # "reportedly" qualifier in source
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships
  - id: c-kalshi-046
    property: partneredWith
    text: "CNN integration with real-time prediction market news ticker"
    type: factual
    confidence: verified
    sources:
      - resourceId: popular-info-casinofication
        support: full
    temporal: { type: historical, date: 2025 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships
  - id: c-kalshi-047
    property: partneredWith
    text: "CNBC partnership for 2026 editorial coverage integration"
    type: factual
    confidence: verified
    sources:
      - resourceId: popular-info-casinofication
        support: full
    temporal: { type: historical, date: 2026 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships
  - id: c-kalshi-048
    property: partneredWith
    text: "Coinbase Custody partnership for USDC-powered event trading"
    type: factual
    confidence: verified
    sources:
      - resourceId: kalshi-coinbase-custody
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships
  - id: c-kalshi-049
    text: "Integration with TRON network, Phantom crypto wallet; serving as in-house prediction market for Coinbase"
    type: factual
    confidence: verified
    sources:
      - resourceId: natlawreview-kalshi-records
        support: full
    temporal: { type: historical, date: 2025 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-partnerships
  # === COMPETITION (4 claims) ===
  - id: c-kalshi-050
    property: competesWith
    text: "Kalshi and Polymarket form a duopoly, collectively over $44 billion trading volume in 2025"
    type: relational
    confidence: verified
    sources:
      - resourceId: cryptopolitan-kalshi-2026
        support: full
    temporal: { type: point-in-time, asOf: 2025 }
    entityRefs:
      - { id: kalshi, role: subject }
      - { id: polymarket, role: competitor }
    cluster: kalshi-competition
  - id: c-kalshi-051
    text: "Combined daily volume record of $799 million across both platforms (Jan 17, 2026)"
    type: numeric
    confidence: verified
    sources:
      - resourceId: thestreet-prediction-markets
        support: full
    temporal: { type: historical, date: 2026-01-17 }
    entityRefs:
      - { id: kalshi, role: subject }
      - { id: polymarket, role: mentioned }
    cluster: kalshi-competition
  - id: c-kalshi-052
    text: "Polymarket operates using cryptocurrency and has faced restrictions on US users"
    type: factual
    confidence: verified
    sources:
      - resourceId: contrary-research-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs:
      - { id: polymarket, role: subject }
      - { id: kalshi, role: implied_comparison }
    cluster: kalshi-competition
  - id: c-kalshi-053
    text: "Kalshi's competitive advantages: legal clarity for US users, segregated customer funds with deposit insurance, institutional partnerships"
    type: analytical
    confidence: partial
    supportingClaims: [c-kalshi-010, c-kalshi-040, c-kalshi-045]
    sources:
      - resourceId: kalshi-about
        support: partial
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-competition
  # === REGULATORY CHALLENGES (7 claims) ===
  - id: c-kalshi-060
    property: suedBy
    text: "New York State Gaming Commission issued cease and desist for offering sports betting without state license"
    type: factual
    confidence: verified
    sources:
      - resourceId: cbs-kalshi-regulation
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal
  - id: c-kalshi-061
    property: suedBy
    text: "Federal judge ruled Kalshi must stop offering prediction contracts in Nevada"
    type: factual
    confidence: verified
    sources:
      - resourceId: nevada-independent-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal
  - id: c-kalshi-062
    text: "Nearly two dozen states and tribal gaming authorities have filed federal lawsuits"
    type: factual
    confidence: verified
    sources:
      - resourceId: nevada-independent-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal
  - id: c-kalshi-063
    text: "US District Judge Andrew Gordon ruled event contracts on sporting outcomes do not fall within CFTC's exclusive jurisdiction"
    type: factual
    confidence: verified
    sources:
      - resourceId: nevada-independent-kalshi
        quote: "do not fall within the CFTC's exclusive jurisdiction"
        support: full
    temporal: { type: historical }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal
  - id: c-kalshi-064
    text: "Native American gaming tribes filed briefs calling activities 'brazenly illegal'"
    type: factual
    confidence: verified
    sources:
      - resourceId: nexteventhorizon-kalshi-nfl
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal
  - id: c-kalshi-065
    text: "Class action lawsuit claiming market makers place customers at disadvantage"
    type: factual
    confidence: verified
    sources:
      - resourceId: igaming-kalshi-class-action
        support: full
    temporal: { type: historical, date: 2025 }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal
  - id: c-kalshi-066
    text: "Separate New York class action alleging illegal sports betting and deceptive practices"
    type: factual
    confidence: verified
    sources:
      - resourceId: wikipedia-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-legal
  # === OPERATIONAL CONCERNS (5 claims) ===
  - id: c-kalshi-070
    text: "Incorrectly graded multiple NFL win totals markets; refunded original amounts without paying winners"
    type: factual
    confidence: verified
    sources:
      - resourceId: nexteventhorizon-kalshi-nfl
        support: full
    temporal: { type: historical }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-operations
  - id: c-kalshi-071
    text: "Minimal age verification mechanisms despite advertising as legal for 18+"
    type: factual
    confidence: verified
    sources:
      - resourceId: gamblingharm-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-operations
  - id: c-kalshi-072
    text: "No prominent messaging or resources for users experiencing gambling problems"
    type: factual
    confidence: verified
    sources:
      - resourceId: gamblingharm-kalshi
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-operations
  - id: c-kalshi-073
    text: "Insider trading violations do not currently carry criminal penalties on regulated prediction platforms"
    type: factual
    confidence: verified
    sources:
      - resourceId: axios-kalshi-insider-trading
        support: full
    temporal: { type: ongoing }
    entityRefs: [{ id: kalshi, role: subject }]
    cluster: kalshi-operations
  - id: c-kalshi-074
    text: "$30,000 bet on Polymarket yielded $436,760 from suspected insider information (Maduro capture contract)"
    type: factual
    confidence: verified
    sources:
      - resourceId: axios-kalshi-insider-trading
        support: full
    temporal: { type: historical }
    entityRefs:
      - { id: polymarket, role: subject }
      - { id: kalshi, role: context }
    cluster: kalshi-operations
# === AI SAFETY RELEVANCE (3 claims) ===
- id: c-kalshi-080
  text: "AI Research Pause Market (KXAI PAUSE-27) assigns 12% probability to any major AI company pausing research for safety before January 2027"
  type: numeric
  confidence: verified
  value: 0.12
  sources:
    - resourceId: kalshi-ai-pause-market
      support: full
  temporal: { type: point-in-time, asOf: 2026-02 }
  entityRefs:
    - { id: kalshi, role: platform }
    - { id: anthropic, role: mentioned }
    - { id: openai, role: mentioned }
  cluster: kalshi-ai-safety
- id: c-kalshi-081
  text: "Offers contracts on whether AI regulation will become federal law by January 2027 (KXAI LEGISLATION-27)"
  type: factual
  confidence: verified
  sources:
    - resourceId: kalshi-ai-legislation-market
      support: full
  temporal: { type: ongoing }
  entityRefs: [{ id: kalshi, role: platform }]
  cluster: kalshi-ai-safety
- id: c-kalshi-082
  text: "Low probability assignments suggest trader skepticism about imminent AI safety pauses"
  type: analytical
  confidence: partial
  supportingClaims: [c-kalshi-080]
  sources: []
  temporal: { type: point-in-time, asOf: 2026-02 }
  entityRefs: [{ id: kalshi, role: subject }]
  cluster: kalshi-ai-safety
# === CONSENSUS & ANALYTICAL CLAIMS (5 claims) ===
- id: c-kalshi-090
  text: "Kalshi's mission is to democratize finance by enabling everyday users to capitalize on opinions and hedge personal risks"
  type: factual
  confidence: verified
  sources:
    - resourceId: kalshi-about
      support: full
  temporal: { type: ongoing }
  entityRefs: [{ id: kalshi, role: subject }]
- id: c-kalshi-091
  text: "EA Forum communities give generally positive but cautious reception; recognize legal milestones but note attention costs and limited EA-relevant coverage"
  type: consensus
  confidence: moderate
  consensusBreadth: 3
  sources:
    - resourceId: ea-forum-prediction-markets-corporate
      support: partial
    - resourceId: ea-forum-predicting-for-good
      support: partial
    - resourceId: kalshi-harnessing-prediction-markets
      support: partial
  temporal: { type: ongoing }
  entityRefs: [{ id: kalshi, role: subject }]
  cluster: kalshi-reception
- id: c-kalshi-092
  text: "Fee structure designed to incentivize liquidity provision while remaining competitive with traditional sportsbook vigorish"
  type: analytical
  confidence: partial
  supportingClaims: [c-kalshi-032]
  reasoning: "Zero maker fees + low taker fees (0.07-7%) is typical exchange design for liquidity. Fee range is lower than typical sportsbook vig (5-10%)."
  sources: []
  temporal: { type: ongoing }
  entityRefs: [{ id: kalshi, role: subject }]
- id: c-kalshi-093
  text: "Prediction markets systematically fail on fat-tailed, improbable events like revolutions"
  type: consensus
  confidence: moderate
  consensusBreadth: 2
  sources:
    - resourceId: kalshi-harnessing-prediction-markets
      support: partial
  temporal: { type: ongoing }
  entityRefs:
    - { id: kalshi, role: context }
    - { id: prediction-markets, role: subject }
    - { id: philip-tetlock, role: mentioned }
  cluster: kalshi-reception
- id: c-kalshi-094
  text: "5.5x valuation increase from Series D to Series E in just six months"
  type: numeric
  confidence: verified
  value: 5.5
  supportingClaims: [c-kalshi-024, c-kalshi-025]
  sources:
    - resourceId: signalhub-kalshi
      support: full
  temporal: { type: historical, date: 2025-12 }
  entityRefs: [{ id: kalshi, role: subject }]
  cluster: kalshi-funding
Summary Statistics
| Category | Claim count | Types |
|---|---|---|
| Founding & history | 8 | 8 factual |
| Regulatory milestones | 7 | 7 factual |
| Funding rounds | 7 | 6 numeric, 1 factual |
| Platform operations | 6 | 4 factual, 2 numeric |
| Partnerships | 10 | 10 factual |
| Competition | 4 | 1 relational, 1 numeric, 1 factual, 1 analytical |
| Legal challenges | 7 | 7 factual |
| Operational concerns | 5 | 5 factual |
| AI safety relevance | 3 | 1 numeric, 1 factual, 1 analytical |
| Consensus & analytical | 5 | 1 factual, 2 consensus, 1 analytical, 1 numeric |
| Total | 62 | 45 factual, 11 numeric, 2 consensus, 3 analytical, 1 relational |
Clusters Used
| Cluster | Claims | What it groups |
|---|---|---|
| `kalshi-founding` | 8 | Origin story, founders, early decisions |
| `kalshi-regulation` | 7 | CFTC approval and political contracts saga |
| `kalshi-funding` | 8 | All funding rounds + totals |
| `kalshi-platform` | 4 | How the product works |
| `kalshi-scale` | 2 | Volume and usage metrics |
| `kalshi-partnerships` | 10 | All partnership claims |
| `kalshi-competition` | 4 | Polymarket comparison, market dynamics |
| `kalshi-legal` | 7 | State lawsuits, tribal opposition, class actions |
| `kalshi-operations` | 5 | Grading errors, consumer protection, insider trading |
| `kalshi-ai-safety` | 3 | AI-related prediction markets |
| `kalshi-reception` | 2 | EA/forecasting community views |
Properties Used
| Property | Usage count | Domain constraint |
|---|---|---|
| `partneredWith` | 9 | organization |
| `fundedBy` | 6 | organization |
| `suedBy` | 3 | organization |
| `employedAt` | 2 | person |
| `foundedBy` | 1 | organization |
| `educatedAt` | 1 | person |
| `regulatedBy` | 1 | organization |
| `competesWith` | 1 | organization |
| (no property — free claim) | 38 | — |
This reveals something important: 24 of 62 claims use typed relationship properties (39%), while 38 are free-text claims without a formal property. This is expected and fine — not every assertion maps cleanly to a property. The free claims still benefit from structured metadata (source, confidence, temporality, entity refs, cluster).
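The split between typed and free claims can be recomputed from the usage table. The following is a minimal sketch; the counts are copied from the table above, and the helper name `property_coverage` is only illustrative.

```python
# Usage counts copied from the "Properties Used" table above.
PROPERTY_USAGE = {
    "partneredWith": 9,
    "fundedBy": 6,
    "suedBy": 3,
    "employedAt": 2,
    "foundedBy": 1,
    "educatedAt": 1,
    "regulatedBy": 1,
    "competesWith": 1,
}
TOTAL_CLAIMS = 62


def property_coverage(usage: dict[str, int], total: int) -> tuple[int, int, float]:
    """Return (typed claims, free claims, typed share as a fraction of total)."""
    typed = sum(usage.values())
    return typed, total - typed, typed / total


typed, free, share = property_coverage(PROPERTY_USAGE, TOTAL_CLAIMS)
print(f"{typed} typed, {free} free, {share:.0%} typed")  # 24 typed, 38 free, 39% typed
```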
Observations from the Kalshi Exercise
What decomposition reveals
- Source concentration. Of 62 claims, 12 come from a single source (Contrary Research). Processing that one resource yields ~20% of all claims. This validates the resource-centric authoring workflow.
- Temporal diversity matters. Claims span historical (founding, milestones), point-in-time (valuations, volumes), and ongoing (platform operations, legal status). Without temporal typing, a system can't distinguish "founded in 2018" (never changes) from "90% sports volume" (changes quarterly).
- Cross-entity claims are high-value. The competition claims (c-kalshi-050 through c-kalshi-053) reference both Kalshi and Polymarket. These are exactly the claims that should appear on both entity pages and on a prediction markets comparison page: the multi-page reuse case.
- Analytical claims need explicit reasoning chains. Claims c-kalshi-053, c-kalshi-082, and c-kalshi-092 are the wiki's own analysis. Making `supportingClaims` explicit prevents these from being treated as sourced facts and enables validation ("do the supporting claims actually support this inference?").
- Clusters emerge naturally. The 11 clusters correspond roughly to the page's section structure but aren't identical. The `kalshi-partnerships` cluster spans what's actually two page sections (Sports + Data/Technology), while `kalshi-regulation` spans three sections. Clusters reflect topical groupings; page sections reflect editorial choices about presentation.
- Most claims don't need relationship properties. Only 39% of claims use typed properties. The rest are free assertions with structured metadata. This means the property vocabulary can stay small: you don't need a property for every possible assertion.
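The structural half of the reasoning-chain check could look roughly like the sketch below. It assumes claims are loaded as plain dictionaries with the field names used in the YAML above; the function name `check_reasoning_chains` is hypothetical. Whether a supporting claim actually justifies the inference still needs human review.

```python
def check_reasoning_chains(claims: list[dict]) -> list[str]:
    """Return advisory messages for analytical claims with missing or broken support links."""
    known_ids = {claim["id"] for claim in claims}
    problems = []
    for claim in claims:
        if claim.get("type") != "analytical":
            continue
        supporting = claim.get("supportingClaims", [])
        if not supporting:
            problems.append(f"{claim['id']}: analytical claim lists no supportingClaims")
        for ref in supporting:
            if ref not in known_ids:
                problems.append(f"{claim['id']}: supporting claim {ref} not found")
    return problems


# c-kalshi-032 is deliberately left out of this sample so the check has something to report.
sample = [
    {"id": "c-kalshi-080", "type": "numeric"},
    {"id": "c-kalshi-082", "type": "analytical", "supportingClaims": ["c-kalshi-080"]},
    {"id": "c-kalshi-092", "type": "analytical", "supportingClaims": ["c-kalshi-032"]},
]
problems = check_reasoning_chains(sample)
print(problems)
```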
What this enables
With these 62 claims structured, you could:
- Generate a comparison page: pull all `competesWith` claims plus numeric facts for Kalshi and Polymarket, and compose a comparison view automatically.
- Detect staleness: the trading volume and AI market probability claims are point-in-time and carry `temporal.asOf` dates (c-kalshi-080 is dated 2026-02). If the current date is well past a claim's `asOf`, flag it for update.
- Track source quality: 12 claims from Contrary Research, 7 from Sigma World, 6 from Kalshi's own announcements. If Contrary Research publishes an update, you know which claims to re-verify.
- Build entity timelines: filter claims by `temporal.date`, sort chronologically, and you have a timeline view without any additional authoring.
- Cross-entity consistency: if Polymarket's page says "$44B combined volume" but the Kalshi page says "$44B," the system can surface that these reference the same underlying fact via the shared cluster.
- Surface analytical gaps: the `kalshi-ai-safety` cluster has only 3 claims, all fairly shallow. A dashboard could flag "AI safety relevance is under-developed for this entity."
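As an illustration of the staleness check, here is a minimal sketch. It assumes `asOf` values arrive as strings such as `2026-02`, and the three-month threshold, the helper names, and the `c-example-old` claim are all hypothetical.

```python
from datetime import date


def parse_as_of(value: str) -> date:
    """Parse 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD', defaulting missing parts to 1."""
    parts = [int(p) for p in value.split("-")]
    year, month, day = (parts + [1, 1])[:3]
    return date(year, month, day)


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def stale_claims(claims: list[dict], today: date, max_age_months: int = 3) -> list[str]:
    """Return IDs of point-in-time claims whose asOf is older than the threshold."""
    stale = []
    for claim in claims:
        temporal = claim.get("temporal", {})
        if temporal.get("type") != "point-in-time" or "asOf" not in temporal:
            continue  # historical and ongoing claims are not aged out here
        if months_between(parse_as_of(temporal["asOf"]), today) > max_age_months:
            stale.append(claim["id"])
    return stale


claims = [
    {"id": "c-kalshi-080", "temporal": {"type": "point-in-time", "asOf": "2026-02"}},
    {"id": "c-kalshi-081", "temporal": {"type": "ongoing"}},
    # Hypothetical older claim, included only to show a positive result.
    {"id": "c-example-old", "temporal": {"type": "point-in-time", "asOf": "2025-09"}},
]
stale = stale_claims(claims, today=date(2026, 3, 15))
print(stale)
```

The same parsed dates could be reused to sort claims for a timeline view.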
Property Ontology Design Decisions
Decision 1: How many properties to start with?
Resolved. The KB system launched with ~100 properties and continues to grow. The original recommendation of 25-30 was conservative; in practice, the combination of financial metrics, biographical data, organizational structure, and AI-specific properties required a larger vocabulary. This aligns with the document's own research note that "80-150 total properties" would be needed.
Originally recommended initial vocabulary (most now exist in KB):
| # | Property | Domain | Range |
|---|---|---|---|
| 1 | foundedBy | organization | person |
| 2 | ceo | organization | person |
| 3 | employedAt | person | organization |
| 4 | educatedAt | person | organization |
| 5 | investedIn | organization, person | organization |
| 6 | fundedBy | organization | organization, person |
| 7 | acquiredBy | organization | organization |
| 8 | partneredWith | organization | organization |
| 9 | regulatedBy | organization | organization |
| 10 | suedBy | organization | organization, person |
| 11 | competesWith | organization | organization |
| 12 | parentOrg | organization | organization |
| 13 | subsidiaryOf | organization | organization |
| 14 | addresses | approach, policy, project | risk, risk-factor |
| 15 | mitigatedBy | risk | approach, policy |
| 16 | implementedBy | approach | organization, project |
| 17 | critiquedBy | any | person, argument |
| 18 | endorsedBy | any | person, organization |
| 19 | developedModel | organization | model |
| 20 | authoredBy | resource, argument | person |
| 21 | advisorTo | person | organization |
| 22 | boardMemberOf | person | organization |
| 23 | previouslyAt | person | organization |
| 24 | causedBy | event, risk | any |
| 25 | ledTo | event | event |
Decision 2: Where do properties live?
Resolved. Properties live in packages/kb/data/properties.yaml. The format evolved from what was proposed but follows the same principles:
# packages/kb/data/properties.yaml (actual format)
properties:
  founded-by:
    name: Founded By
    description: "Person(s) who founded this organization"
    dataType: refs
    category: people
    appliesTo: [organization]
    inverseId: founder-of
    inverseName: Founded
  employed-by:
    name: Employed By
    description: "Organization this person currently or historically works for"
    dataType: ref
    category: people
    temporal: true
    appliesTo: [person]
    inverseId: employer-of
    inverseName: Employs
  revenue:
    name: Revenue
    description: "Annualized run-rate revenue (ARR) or trailing twelve-month revenue"
    dataType: number
    unit: USD
    category: financial
    temporal: true
    appliesTo: [organization]
    display:
      divisor: 1e9
      prefix: "$"
      suffix: "B"
Key differences from the original proposal: kebab-case property IDs instead of camelCase, appliesTo instead of domainIncludes/rangeIncludes, inverseId instead of inverse, and display config for rendering. The rangeIncludes concept is implicit in the dataType (ref/refs targets are validated against entity existence, not type-constrained).
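To illustrate how `inverseId` declarations can drive auto-computed inverse relationships, the sketch below derives inverse facts from forward ones. It is not the KB implementation: the flat fact dictionaries, the `compute_inverses` name, and the entity IDs (`example-org`, `person-a`, `person-b`) are assumptions made for the example.

```python
# Trimmed property definitions, following the properties.yaml excerpt above.
PROPERTIES = {
    "founded-by": {"dataType": "refs", "inverseId": "founder-of"},
    "employed-by": {"dataType": "ref", "inverseId": "employer-of"},
    "revenue": {"dataType": "number"},
}


def compute_inverses(facts: list[dict], properties: dict) -> list[dict]:
    """Derive one inverse fact per referenced entity for properties that declare inverseId."""
    inverses = []
    for fact in facts:
        inverse_id = properties.get(fact["property"], {}).get("inverseId")
        if inverse_id is None:
            continue  # properties without an inverse (e.g. revenue) produce nothing
        value = fact["value"]
        targets = value if isinstance(value, list) else [value]  # refs vs. ref
        for target in targets:
            inverses.append(
                {"entity": target, "property": inverse_id, "value": fact["entity"]}
            )
    return inverses


# Hypothetical entities, for illustration only.
facts = [
    {"entity": "example-org", "property": "founded-by", "value": ["person-a", "person-b"]},
    {"entity": "person-a", "property": "employed-by", "value": "example-org"},
    {"entity": "example-org", "property": "revenue", "value": 1e9},
]
inverses = compute_inverses(facts, PROPERTIES)
for inverse in inverses:
    print(inverse)
```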
Decision 3: How do claims and facts coexist?
Partially resolved by KB. In the KB system, all structured data lives in a single place: entity files in packages/kb/data/things/. Facts are stored directly on entities with source attribution:
# packages/kb/data/things/kalshi.yaml (actual KB format)
facts:
  - id: f_kalshi_val_2025
    property: valuation
    value: 11e9
    asOf: 2025-12
    source: https://example.com/kalshi-series-e
    sourceQuote: "$11 billion valuation"
    notes: "Series E round led by Paradigm"
This eliminates the dual-store problem. The original proposal's distinction between "fact store" (display, timeseries) and "claim store" (provenance, confidence) collapsed into a single KB fact with both display config (from properties.yaml) and provenance (source fields on each fact). The claim-level metadata (confidence, type taxonomy) proposed in Layer 4 is not yet part of KB facts.
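A rough sketch of how display config and a stored value might combine at render time is shown below. The document only shows display settings for `revenue`; applying the same settings to the `valuation` fact is an assumption for illustration, and `format_value` is a hypothetical helper, not a KB function.

```python
# Display settings as declared for `revenue` in the properties.yaml excerpt.
# Assumption: `valuation` is rendered with the same settings.
DISPLAY = {"divisor": 1e9, "prefix": "$", "suffix": "B"}


def format_value(value: float, display: dict) -> str:
    """Scale a raw number by the divisor, then wrap it in the prefix and suffix."""
    scaled = value / display.get("divisor", 1)
    # The 'g' format drops trailing zeros: 11.0 -> "11", 2.5 -> "2.5".
    return f"{display.get('prefix', '')}{scaled:g}{display.get('suffix', '')}"


print(format_value(11e9, DISPLAY))  # $11B
```

Because the formatting lives with the property rather than the fact, a change to the display config would apply to every fact that uses that property.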
Decision 4: Advisory constraints or hard enforcement?
Resolved: Advisory with validation. The KB system follows the Schema.org-style approach recommended here. appliesTo constraints in properties.yaml are checked by KB validation rules: using a property on an entity type it doesn't apply to generates a validation warning, but the system does not hard-block fact creation. The 23 validation rules cover integrity (valid property IDs, valid refs), consistency (temporal flags match property declarations), and quality (source coverage) without being overly restrictive.
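The advisory behavior can be sketched as a check that collects warnings instead of raising. This is an illustration under stated assumptions, not one of the actual KB rules: the function name, the warning wording, and the fact IDs are hypothetical.

```python
# Trimmed property definitions, following the properties.yaml excerpt.
PROPERTIES = {
    "founded-by": {"appliesTo": ["organization"]},
    "employed-by": {"appliesTo": ["person"]},
}


def validate_applies_to(entity_type: str, facts: list[dict], properties: dict) -> list[str]:
    """Collect warnings for facts whose property does not apply to the entity type."""
    warnings = []
    for fact in facts:
        prop = properties.get(fact["property"])
        if prop is None:
            continue  # unknown property IDs are assumed to be caught by a separate integrity rule
        allowed = prop.get("appliesTo")
        if allowed and entity_type not in allowed:
            warnings.append(
                f"{fact['id']}: property '{fact['property']}' applies to "
                f"{allowed}, not '{entity_type}'"
            )
    return warnings  # advisory only: nothing is raised and no fact is rejected


# Hypothetical facts on a person entity.
person_facts = [
    {"id": "f_example_1", "property": "founded-by", "value": ["someone"]},
    {"id": "f_example_2", "property": "employed-by", "value": "example-org"},
]
warnings = validate_applies_to("person", person_facts, PROPERTIES)
print(warnings)
```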
Implementation Path
Phase 1: Property vocabulary (data layer only) — COMPLETE
- Create `data/properties.yaml` with the initial 25 properties -- Done: `packages/kb/data/properties.yaml` with 95 properties
- Add a `Property` Zod schema to `data/schema.ts` -- Done: KB validation in `packages/kb/`
- Validate property references in build-data -- Done: 23 validation rules
Phase 2: Entity data population — COMPLETE
- Pilot with a few entities -- Done: 360+ entities in `packages/kb/data/things/`
- Build dashboards -- Done: Fact Dashboard (E898) and related citation dashboards
- Define schemas per entity type -- Done: 19 schemas in `packages/kb/data/schemas/`
Phase 3: Extraction and display tooling — COMPLETE
- Integrate KB facts into wiki pages -- Done: `<KBF>`, `<KBFactValue>`, and `<Calc>` MDX components
- Build fact extraction pipeline -- Done: `crux footnotes` commands for migrating claims to KB facts
- Validate property constraints and source coverage -- Done: KB validation rules
Phase 4: Claim-level metadata — NOT STARTED
This is the remaining gap. The KB system captures what is true with sources, but does not yet support:
- Claim-type taxonomy (factual, consensus, analytical, speculative)
- Confidence levels (verified, partial, unverified)
- Explicit reasoning chains for analytical claims (`supportingClaims`)
- Verification status tracking
- Claim clustering for topical grouping
Open Questions
Some original questions have been resolved by KB; others remain open.
- Claim ID format. Resolved: KB uses `f_` prefixed IDs (e.g., `f_qR5tY9wE1a`): 10-char alphanumeric, generated, collision-resistant. The `c-kalshi-001` format from this proposal was never adopted.
- File granularity. Resolved: One file per entity in `packages/kb/data/things/`. Large entities have many facts in one file, which works well in practice.
- Who maintains facts? Partially resolved. The `crux footnotes` pipeline can migrate prose claims to KB facts. The content improve pipeline (`crux content improve`) writes prose but does not yet update KB facts directly. KB facts and prose coexist, with `<KBF>` components bridging them in MDX.
- How do facts and prose stay in sync? The `<KBF>` and `<KBFactValue>` MDX components render KB fact values inline in prose, so updating a KB fact automatically updates the displayed value. For prose that doesn't use these components, sync remains manual.
- Property hierarchy. Not yet implemented. KB properties have `category` groupings (financial, people, biographical, organization, product, etc.) but no formal sub-property relationships. With 95 properties, some hierarchy may become useful for dashboard navigation.
- Claim-level metadata. (New) Should the KB system add confidence/verification fields to facts? The current source/notes fields are free-text. Adding structured confidence would enable the verification workflows described in Layer 4, but adds complexity to every fact entry.
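For the ID format in the first item, the document does not describe how KB generates IDs, so the following is only one plausible sketch: a random 10-character alphanumeric suffix after `f_`, re-drawn on collision. The function name `new_fact_id` and the in-memory `existing` set are assumptions.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 possible characters per position


def new_fact_id(existing: set[str], length: int = 10) -> str:
    """Draw random 'f_' IDs until one is unused, record it in `existing`, and return it."""
    while True:
        candidate = "f_" + "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in existing:
            existing.add(candidate)
            return candidate


used_ids: set[str] = set()
fact_id = new_fact_id(used_ids)
print(fact_id)
```

With 62 characters over 10 positions there are 62^10 possible suffixes, so the retry loop should almost never run more than once at the scale described here.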
Current Status Summary (March 2026)
| Proposal area | Status | Where it lives |
|---|---|---|
| Property vocabulary | Implemented (95 properties) | packages/kb/data/properties.yaml |
| Entity type schemas | Implemented (19 types) | packages/kb/data/schemas/*.yaml |
| Entity data files | Implemented (360+ entities) | packages/kb/data/things/*.yaml |
| Value types (number, text, date, ref, etc.) | Implemented (9 types) | KB fact schema |
| Temporal qualifiers | Implemented (asOf, validEnd) | KB fact fields |
| Source attribution | Implemented (source, sourceResource, sourceQuote, notes) | KB fact fields |
| Inverse relationships | Implemented (auto-computed from inverseId) | properties.yaml declarations |
| Record collections (funding-rounds, etc.) | Implemented | Entity type schemas |
| Display configuration | Implemented (divisor, prefix, suffix) | properties.yaml display config |
| Validation | Implemented (23 rules) | KB validation system |
| MDX integration | Implemented (<KBF>, <KBFactValue>, <Calc>) | MDX components |
| Fact extraction pipeline | Implemented | crux footnotes commands |
| Claim-type taxonomy | Not implemented | — |
| Confidence / verification levels | Not implemented | — |
| Reasoning chains (supportingClaims) | Not implemented | — |
| Claim clustering | Not implemented | — |
The KB system has successfully realized the core data model proposed in this document. The remaining aspirational layer (claim-level metadata for confidence, verification, and reasoning) represents an evolution beyond what most knowledge graphs implement, and may be pursued if the use cases for it become clearer.