Automation Landscape
All automated processes in the longterm-wiki project, grouped by execution tier. Includes schedules, descriptions, current status, and a daily timeline showing when things run.
- Total processes: 45
- Active: 38
- Disabled: 3
- Manual-only: 4
- GHA scheduled: 14
- GK tasks (active): 9/10
- K8s services: 4
All scheduled GitHub Actions respect the AUTOMATION_PAUSED repository variable. Set it to true to pause all automation (e.g., during incident response). Resume with pnpm crux ci resume-actions.
GitHub Actions -- Scheduled
| Name | Schedule | Description | Status |
|---|---|---|---|
| Database Backup | Daily 02:00 UTC | pg_dump to compressed artifact. Also callable as reusable workflow (e.g. pre-deploy). | Active |
| Wiki Server Data Export | Daily 04:00 UTC | Exports session logs, edit logs, and job data from wiki-server API to GHA artifacts (90-day retention). Safety net for PG data. | Active |
| Refresh Resources Snapshot | Daily 04:30 UTC | Fetches all resources from PG, uploads as artifact. Local dev fallback when wiki-server is unavailable. | Active |
| Auto-Rebase PRs | Every 4 hours | Rebases open PRs onto main to surface conflicts early. Skips PRs with working labels, recent activity, or in merge queue. | Active |
| Job Worker | Every 30 min + on-demand dispatch | Polls wiki-server job queue, claims and executes jobs. Types: ping, citation-verify, page-improve, page-create, batch-commit, auto-update-digest, claim-verification, resource-ingest, resource-enrich. | Active |
| Daily Data Validation | Daily 07:00 UTC | Runs all data validators (local YAML/MDX + server PG-backed) to catch drift in production data. | Active |
| Server & API Health | Twice daily: 08:00 + 20:00 UTC | Wiki-server availability, DB record counts, API smoke tests. Auto-creates/updates/closes GitHub issues under the 'wellness' label. | Active |
| Frontend & Data Health | Twice daily: 08:05 + 20:05 UTC | Frontend availability and data freshness checks. Staggered +5 min from server health to avoid duplicate issues. | Active |
| CI & PR Health | Twice daily: 08:10 + 20:10 UTC | Workflow health, job queue status, PR/issue quality. Cleans up stale working labels (agent:working, pr-patrol:working). | Active |
| Render Quality Monitor | Every 6 hours | Playwright render audit against production. Catches display regressions (raw numbers, JSON leaks). Auto-creates/closes GitHub issues. | Active |
| Scheduled Maintenance | Weekdays 07:30 UTC / Sunday 09:00 / Monthly 1st 10:00 | Claude Code maintenance sweeps. Daily: review merged PRs. Weekly: full sweep (triage, cruft). Monthly: deep cleanup. | Active |
| Sourcing | Weekly Monday 04:00 UTC | Periodic sourcing verification of facts and records against cited URLs using Anthropic Batch API (50% cost discount). | Active |
| Sourcing Recheck | Weekly Monday 08:00 UTC | Re-verifies stale sourcing verdicts. Detects when previously-verified data becomes contradicted by updated sources. | Active |
| Wiki Server Dead-Man-Switch | Weekly Monday 06:00 UTC | Verifies groundskeeper daemon is alive and monitoring wiki-server health. Catches the case where groundskeeper itself goes down. | Active |
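The AUTOMATION_PAUSED check that the scheduled workflows in the table above respect could be expressed roughly as follows. This is a minimal sketch, not the project's actual guard: the function name shouldRun, the TriggerKind type, and the choice to gate only scheduled runs (leaving manual dispatch and pushes alone) are all assumptions. Only the variable name and the value true come from this page.

```typescript
// Hypothetical guard for scheduled workflows.
// `vars` stands in for the repository variables; only AUTOMATION_PAUSED is named on this page.
type TriggerKind = "schedule" | "workflow_dispatch" | "push";

function shouldRun(
  vars: Record<string, string | undefined>,
  trigger: TriggerKind,
): boolean {
  const paused =
    (vars["AUTOMATION_PAUSED"] ?? "").trim().toLowerCase() === "true";
  // Assumption: only scheduled runs are gated; manual dispatch and pushes still run.
  return !(paused && trigger === "schedule");
}
```

With this shape, an unset or empty variable behaves the same as false, so automation runs by default and pausing is an explicit opt-in.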
GitHub Actions -- Event-Triggered
| Name | Trigger | Description | Status |
|---|---|---|---|
| Sync Entities & Facts | On push to main/production + Weekly Monday 07:00 UTC | Syncs data/entities and packages/factbase YAML to PG. Weekly run catches stale entries from missed triggers. | Active |
| CI | On push to main/production + PRs | Full CI pipeline: build data, validate gate, tests, TypeScript checks, Vercel deploy (production branch). | Active |
| Wiki Server Build & Deploy | On push to production (paths: apps/wiki-server/**) | Docker build, smoke test, GHCR push, ArgoCD deploy, post-deploy verification, auto-rollback on failure. | Active |
| Worker Build & Deploy | On push to production (paths: crux/worker/**, docker/worker/**) | Docker build and deploy for the K8s job worker daemon. | Active |
| Groundskeeper Build & Push | On push to production (paths: apps/groundskeeper/**) | Docker image build and push to GHCR for the groundskeeper daemon. | Active |
| Discord Bot Build & Push | On push to production (paths: apps/discord-bot/**) | Docker image build and push to GHCR for the Discord bot. | Active |
| Auto-Enqueue on Auto-Merge | On auto_merge_enabled event | When auto-merge is enabled after checks pass, enqueues PR into merge queue via GraphQL (GitHub limitation workaround). | Active |
| Auto-Update Review Gate | On PR open/sync/label (auto-update PRs only) | Citation audit gate for auto-update PRs. Runs citation audit, fixes inaccuracies, posts summary. | Active |
| E2E Post-Deploy Smoke Tests | After CI completes on production | Playwright E2E tests against live production site after deploy. | Active |
| E2E PR Check | On PRs (paths: apps/web/src/**, e2e/**) | Render quality audit and no-raw-IDs specs against a local build of the PR. Optional (non-required) check. | Active |
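To illustrate how the path filters in the table above decide which deploy workflows fire on a push to production, here is a small sketch. The path prefixes are copied from the table. The function name triggeredWorkflows and the simplification of each dir/** glob to a plain prefix match are assumptions, and real GitHub Actions path filtering supports richer glob syntax than this.

```typescript
// Path prefixes taken from the table; each "dir/**" glob is simplified to a prefix.
const deployWorkflows: Array<{ workflow: string; prefixes: string[] }> = [
  { workflow: "Wiki Server Build & Deploy", prefixes: ["apps/wiki-server/"] },
  { workflow: "Worker Build & Deploy", prefixes: ["crux/worker/", "docker/worker/"] },
  { workflow: "Groundskeeper Build & Push", prefixes: ["apps/groundskeeper/"] },
  { workflow: "Discord Bot Build & Push", prefixes: ["apps/discord-bot/"] },
];

// Hypothetical helper: which path-filtered workflows would a set of changed files trigger?
function triggeredWorkflows(changedFiles: string[]): string[] {
  return deployWorkflows
    .filter((w) =>
      changedFiles.some((file) => w.prefixes.some((p) => file.startsWith(p))),
    )
    .map((w) => w.workflow);
}
```

A push that touches only files outside these directories triggers none of the four, which is why CI and the entity sync (which have no path filter listed) are the only workflows guaranteed to run on every push.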
GitHub Actions -- Manual / Disabled
| Name | Trigger | Description | Status |
|---|---|---|---|
| Auto-Update Wiki | Manual dispatch only (schedule disabled) | News-driven automatic wiki updates. Schedule was daily 06:00 UTC, now disabled in favor of groundskeeper auto-update-enqueue task. | Disabled |
| Create Release PR | Manual dispatch only | Creates or updates a PR from main to production with auto-generated changelog. Does not auto-merge. | Manual |
| Database Restore | Manual dispatch only | Restores database from a backup artifact. Requires confirmation input. | Manual |
| Sync Data to Wiki Server | Manual dispatch only | Syncs session logs and auto-update runs to wiki-server. Push-path triggers removed since data is now DB-stored. | Manual |
| Resolve Merge Conflicts | Manual dispatch only (schedule disabled) | Finds conflicted PRs and resolves via two-tier approach: Sonnet API for text, Claude Code CLI for complex cases. | Disabled |
| Wikidata Enrichment | Manual dispatch only | Enriches person entities with structured facts from Wikidata (birth year, education, etc.). | Manual |
Groundskeeper Tasks (K8s Daemon)
| Name | Schedule | Description | Status |
|---|---|---|---|
| GK: Health Check | Every 5 min | Checks wiki-server availability and PG connectivity. Creates/closes GitHub issues with incident buffering and cooldowns. | Active |
| GK: Job Worker Health | Every 5 min | Monitors job queue health: pending backlog, failure rates, stuck jobs. | Active |
| GK: Session Sweep | Every 4 hours | Cleans up stale agent sessions and working labels. | Active |
| GK: Snapshot Retention | Daily 03:00 UTC | Prunes old page snapshots beyond retention limit (default: keep 100 per page). | Active |
| GK: Data Quality Snapshot | Daily 06:00 UTC | Captures data quality metrics snapshot for trend tracking. | Active |
| GK: Auto-Update Enqueue | Daily 06:00 UTC | Enqueues auto-update jobs for wiki pages. Budget: $30/run, max 5 pages. | Active |
| GK: Job Failure Triage | Every 6 hours | Groups recent failed jobs by type and error pattern to surface recurring failures. | Active |
| GK: GitHub Shadowban Check | Daily 09:00 UTC | Checks configured GitHub usernames for shadowban status. Requires TASK_GITHUB_SHADOWBAN_CHECK_USERNAMES env. | Active |
| GK: Issue Responder | Every 15 min (disabled) | Auto-responds to new GitHub issues. Hard-disabled in code due to repeated failures. | Disabled |
| GK: TableBase Scan | Daily 05:00 UTC | Scans TableBase records for sourcing coverage and writes scanner-results artifacts used by the coverage dashboard. | Active |
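As an illustration of the Snapshot Retention rule in the table above (keep 100 per page by default), the pruning step might look like the sketch below. The Snapshot shape, the field names, and the function name snapshotsToPrune are hypothetical. The only detail taken from this page is the default limit of 100 per page, and the assumption here is that "old" means oldest by creation time.

```typescript
// Assumed record shape; the real snapshot schema is not described on this page.
interface Snapshot {
  pageId: string;
  createdAt: number; // epoch milliseconds
}

// Returns the snapshots that fall outside the per-page retention limit.
function snapshotsToPrune(snapshots: Snapshot[], keepPerPage = 100): Snapshot[] {
  // Group snapshots by page.
  const byPage = new Map<string, Snapshot[]>();
  for (const s of snapshots) {
    const list = byPage.get(s.pageId) ?? [];
    list.push(s);
    byPage.set(s.pageId, list);
  }

  const prune: Snapshot[] = [];
  byPage.forEach((list) => {
    // Newest first; everything past the first `keepPerPage` entries is pruned.
    const newestFirst = list.slice().sort((a, b) => b.createdAt - a.createdAt);
    prune.push(...newestFirst.slice(keepPerPage));
  });
  return prune;
}
```

Returning the list to delete, rather than deleting in place, keeps the rule easy to test and lets the caller log or dry-run the result first.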
K8s Long-Running Services
| Name | Schedule | Description | Status |
|---|---|---|---|
| Wiki Server (Hono) | Always-on | Hono API server (PG-backed). Serves wiki-server API, job queue, sync endpoints. Deployed via ArgoCD. | Active |
| Job Worker Daemon | Always-on (poll mode) | Long-lived Node.js worker that polls the job queue with concurrency control, memory watchdog, and health probes. | Active |
| Groundskeeper Daemon | Always-on (cron-based tasks) | Node.js daemon running scheduled tasks via node-cron. Circuit breaker per task (3 failures), half-open recovery, Discord notifications. | Active |
| Discord Bot | Always-on | Discord.js bot responding to @mention (wiki Q&A via Anthropic API) and /ask (Claude Code CLI). Rate-limited per user. | Active |
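The per-task circuit breaker mentioned for the Groundskeeper Daemon (3 failures, half-open recovery) follows a well-known pattern, sketched below. This is not the daemon's code: the class name TaskBreaker, the cooldown value, and the reading of "3 failures" as three consecutive failures are assumptions. The threshold of 3 is the only number taken from this page.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Hypothetical per-task breaker; time is passed in explicitly to keep it testable.
class TaskBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  // threshold = 3 comes from this page; the cooldown is an assumed value.
  constructor(threshold = 3, cooldownMs = 60000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  state(now: number): BreakerState {
    if (this.openedAt === null) return "closed";
    // After the cooldown, one trial run is allowed (half-open).
    return now - this.openedAt >= this.cooldownMs ? "half-open" : "open";
  }

  canRun(now: number): boolean {
    return this.state(now) !== "open";
  }

  recordSuccess(): void {
    // Any success closes the breaker and resets the failure count.
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(now: number): void {
    this.failures += 1;
    // Trip on reaching the threshold, or re-open after a failed half-open trial.
    if (this.openedAt !== null || this.failures >= this.threshold) {
      this.openedAt = now;
    }
  }
}
```

In this sketch a failed half-open trial restarts the cooldown, so a persistently broken task is retried at most once per cooldown window instead of on every cron tick.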
External Hosting
| Name | Schedule | Description | Status |
|---|---|---|---|
| Next.js Frontend (Vercel) | Deploy on push to main + production | Next.js 15 frontend. Main branch previews, production branch serves longtermwiki.com. Triggered by CI deploy hook. | Active |
Architecture Notes
Execution Tiers
- GitHub Actions -- ephemeral runners for CI, scheduled jobs, and on-demand work. Job Worker GHA polls every 30 min as backup; the K8s worker daemon handles real-time dispatch.
- K8s (ArgoCD) -- long-running services: wiki-server, job worker daemon, groundskeeper, Discord bot. Managed via Helm charts in the ops repo.
- Vercel -- Next.js frontend. Deploy hooks triggered by CI on production branch.
Key Overlaps
- Health monitoring has two layers: Groundskeeper checks every 5 min (primary), GHA Dead-Man-Switch checks weekly (catches groundskeeper failure).
- Auto-update has two paths: Groundskeeper enqueues jobs daily (active), GHA auto-update workflow is disabled (was the original mechanism).
- Job execution has two paths: K8s worker daemon (always-on, real-time) and GHA Job Worker (polls every 30 min as fallback).
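The two-layer health monitoring described above relies on an outer check noticing when the inner one has gone quiet. A minimal sketch of that staleness test follows. The page does not say how the Dead-Man-Switch detects liveness, so the Heartbeat shape, the function name isDaemonStale, and the policy of tolerating three missed intervals are all assumptions. Only the 5-minute check interval comes from this page.

```typescript
// Hypothetical record of the last health check the daemon performed.
interface Heartbeat {
  source: string; // e.g. "groundskeeper"
  lastSeenMs: number; // epoch milliseconds of the last recorded check
}

// Assumed policy: the daemon is stale once more than `missedAllowed`
// expected intervals have passed without a recorded check.
function isDaemonStale(
  hb: Heartbeat,
  nowMs: number,
  intervalMs: number,
  missedAllowed = 3,
): boolean {
  return nowMs - hb.lastSeenMs > intervalMs * missedAllowed;
}

const fiveMinutesMs = 5 * 60 * 1000; // Groundskeeper health check interval
```

Because the outer check runs only weekly, the tolerance matters less than the fact that it runs on a separate tier (GitHub Actions), so an outage of the K8s cluster cannot silence both layers at once.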
Daily Timeline (UTC)
- 02:00 -- Database backup
- 03:00 -- Snapshot retention cleanup (GK)
- 04:00 -- Wiki-server data export + Sourcing (Mon)
- 04:30 -- Resources snapshot refresh
- 05:00 -- TableBase scan (GK)
- 06:00 -- Data quality snapshot (GK) + Auto-update enqueue (GK) + Dead-man-switch (Mon)
- 07:00 -- Daily data validation + Entity/fact sync (Mon)
- 07:30 -- Scheduled maintenance (weekdays)
- 08:00 -- Server health + Sourcing recheck (Mon)
- 08:05 -- Frontend health
- 08:10 -- CI & PR health
- 09:00 -- GitHub shadowban check (GK) + Weekly maintenance (Sun)
- 20:00 -- Server health (second run)
- 20:05 -- Frontend health (second run)
- 20:10 -- CI & PR health (second run)
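A timeline like the one above can be derived from cron expressions instead of being maintained by hand. The sketch below shows the idea for a few entries. The cron strings are a translation of the schedules listed in the tables into standard five-field cron syntax, not copies of the actual workflow files, and the names schedules, dailyTimes, and timeline are hypothetical.

```typescript
// Cron strings translated from the tables on this page (assumed, not copied from the workflows).
const schedules: Array<{ label: string; cron: string }> = [
  { label: "Server health", cron: "0 8,20 * * *" },
  { label: "Database backup", cron: "0 2 * * *" },
  { label: "Scheduled maintenance (weekdays)", cron: "30 7 * * 1-5" },
  { label: "Resources snapshot refresh", cron: "30 4 * * *" },
  { label: "Auto-Rebase PRs", cron: "0 */4 * * *" },
];

// Expands fixed minute/hour fields ("30", "8,20") into HH:MM strings.
// Step or wildcard fields such as "*/4" yield no entries, matching the
// timeline above, which lists only fixed times of day.
function dailyTimes(cron: string): string[] {
  const [minute, hour] = cron.trim().split(/\s+/);
  if (minute === undefined || hour === undefined) return [];
  if (!/^\d+$/.test(minute) || !/^\d+(,\d+)*$/.test(hour)) return [];
  return hour
    .split(",")
    .map((h) => `${h.padStart(2, "0")}:${minute.padStart(2, "0")}`);
}

// Builds sorted "HH:MM -- label" rows; zero-padded times sort correctly as strings.
function timeline(): string[] {
  const rows: string[] = [];
  for (const s of schedules) {
    for (const t of dailyTimes(s.cron)) {
      rows.push(`${t} -- ${s.label}`);
    }
  }
  return rows.sort();
}
```

Generating the list this way would keep the timeline and the schedule tables from drifting apart, at the cost of handling day-of-week annotations such as (Mon) separately, which this sketch does not attempt.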