tentacle-orchestrator
Bare-metal service orchestrator for the tentacle platform. Watches the desired_services NATS KV bucket and reconciles with systemd — downloading, installing, version-switching, and starting/stopping modules automatically.
Analogy: NATS KV is etcd, tentacle-orchestrator is the kubelet, systemd units are pods.
Overview
desired_services KV (source of truth)
│
▼
tentacle-orchestrator (reconciler)
│
├── checks installed versions on disk
├── downloads from GitHub releases if needed
├── manages symlinks to active version
├── starts/stops systemd units
│
▼
service_status KV (reports back)
What the orchestrator manages: All modules except NATS, the Deno runtime, and itself.
What install.sh manages: NATS, the Deno runtime, the orchestrator itself, and the initial bootstrap.
Key Architecture Decisions
- KV-driven reconciliation: The desired_services KV bucket is the single source of truth. Write a key to install/start a module; delete it to stop managing it. No imperative API.
- Two reconciliation mechanisms:
  - KV Watch (reactive): Watches desired_services for changes, triggers immediate reconcile
  - Periodic sweep (defensive, every 30s): Catches drift from crashes, manual systemctl, etc.
- Symlink-based versioning: Each version is stored in its own directory under versions/{moduleId}/{version}/. The active version is a symlink in bin/ (Go) or services/ (Deno). Rollback = update KV version.
- Offline resilience: If version: "latest" but no internet, falls back to the highest locally-installed version. If no local version exists, sets reconcileState: "version_unavailable".
- Bootstrap from existing installs: On first boot, scans existing bin/ and services/ directories, moves files to versions/{moduleId}/unknown/, and populates desired_services with current systemd state.
NATS KV Buckets
desired_services (no TTL)
One key per module. Missing key = orchestrator ignores that module.
type DesiredServiceKV = {
moduleId: string; // e.g. "tentacle-mqtt"
version: string; // e.g. "0.0.5" or "latest"
running: boolean; // should the systemd unit be active?
updatedAt: number; // epoch ms
};
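Because any client can write to the bucket, a reconciler would likely validate each value before acting on it. The sketch below assumes values are stored as JSON text (the caller decodes the raw bytes first); parseDesiredService is a hypothetical helper name, not part of the documented codebase.

```typescript
type DesiredServiceKV = {
  moduleId: string;
  version: string;
  running: boolean;
  updatedAt: number;
};

/** Hypothetical helper: parse and validate a JSON-encoded KV value (encoding is an assumption). */
function parseDesiredService(json: string): DesiredServiceKV | null {
  let value: unknown;
  try {
    value = JSON.parse(json);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== "object" || value === null) return null;
  const record = value as Record<string, unknown>;
  const moduleId = record["moduleId"];
  const version = record["version"];
  const running = record["running"];
  const updatedAt = record["updatedAt"];
  if (
    typeof moduleId !== "string" ||
    typeof version !== "string" ||
    typeof running !== "boolean" ||
    typeof updatedAt !== "number"
  ) {
    return null; // missing or mistyped field
  }
  return { moduleId, version, running, updatedAt };
}

// Example value for the key "tentacle-mqtt"
const example = JSON.stringify({
  moduleId: "tentacle-mqtt",
  version: "latest",
  running: true,
  updatedAt: Date.now(),
});
console.log(parseDesiredService(example));
```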
service_status (TTL 120s)
Written by the orchestrator every reconcile loop. Auto-expires if orchestrator dies.
type ReconcileState = "ok" | "pending" | "downloading" | "installing"
| "starting" | "stopping" | "error" | "version_unavailable";
type ServiceStatusKV = {
moduleId: string;
installedVersions: string[];
activeVersion: string | null;
systemdState: "active" | "inactive" | "failed" | "not-found";
reconcileState: ReconcileState;
lastError: string | null;
runtime: ModuleRuntime; // "go" | "deno" | "deno-web"
category: ModuleCategory; // "core" | "optional"
repo: string;
updatedAt: number;
};
Relationship to service_enabled
- desired_services.running = should the systemd process be running?
- service_enabled = should a running service do work vs idle heartbeat-only?
- Orthogonal: running: true + enabled: false = process alive but paused
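The two flags combine as in the following sketch. The mode names "stopped", "paused", and "working" and the function effectiveMode are illustrative labels only; the source defines just the two booleans.

```typescript
// Hypothetical labels for the combined state; not part of any documented type.
type EffectiveMode = "stopped" | "paused" | "working";

/**
 * running: desired_services.running (is the systemd process up?)
 * enabled: service_enabled (does a running process do real work?)
 */
function effectiveMode(running: boolean, enabled: boolean): EffectiveMode {
  if (!running) return "stopped"; // enabled is irrelevant without a process
  return enabled ? "working" : "paused"; // paused = alive, heartbeat only
}

console.log(effectiveMode(true, false));
```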
Reconciliation Algorithm
Per-module, each reconcile cycle:
- Resolve version: if "latest", query GitHub API (cached 5 minutes). If offline, use highest local version.
- Download if missing: if resolved version not on disk, download from GitHub releases and install to versions/{moduleId}/{version}/.
- Switch version: if active version differs from desired, stop service, update symlink, regenerate systemd unit, daemon-reload.
- Start/stop: if running state doesn't match systemd state, start or stop as needed.
- Report status: write ServiceStatusKV to the service_status KV bucket.
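One way to picture the cycle is as a pure planning step that turns desired and observed state into an ordered action list, leaving side effects to an executor. This is a sketch under stated assumptions: planActions and the Action type are hypothetical, the version is assumed to be already resolved (no "latest"), and the stop/symlink/unit/daemon-reload sequence is folded into a single "switch" action that leaves the service stopped.

```typescript
type Desired = { version: string; running: boolean }; // version already resolved
type Observed = {
  installedVersions: string[];
  activeVersion: string | null;
  systemdState: "active" | "inactive" | "failed" | "not-found";
};
// Hypothetical action vocabulary for illustration.
type Action =
  | { kind: "download"; version: string }
  | { kind: "switch"; from: string | null; to: string }
  | { kind: "start" }
  | { kind: "stop" }
  | { kind: "report" };

function planActions(desired: Desired, observed: Observed): Action[] {
  const actions: Action[] = [];

  // Download if the resolved version is not on disk.
  if (!observed.installedVersions.includes(desired.version)) {
    actions.push({ kind: "download", version: desired.version });
  }

  // Switch if the active version differs from the desired one.
  const switching = observed.activeVersion !== desired.version;
  if (switching) {
    actions.push({ kind: "switch", from: observed.activeVersion, to: desired.version });
  }

  // Start or stop; a switch is assumed to leave the unit stopped.
  const isRunning = switching ? false : observed.systemdState === "active";
  if (desired.running && !isRunning) actions.push({ kind: "start" });
  if (!desired.running && isRunning) actions.push({ kind: "stop" });

  // Status is always reported at the end of the cycle.
  actions.push({ kind: "report" });
  return actions;
}

const plan = planActions(
  { version: "0.0.6", running: true },
  { installedVersions: ["0.0.5"], activeVersion: "0.0.5", systemdState: "active" },
);
console.log(plan.map((a) => a.kind).join(" -> "));
```

Keeping the plan separate from execution makes the reconcile logic testable without systemd or network access.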
Version Storage Layout
/opt/tentacle/
  versions/                  # Versioned storage
    tentacle-mqtt/
      0.0.5/                 # Deno source
        deno.json, main.ts, ...
      0.0.6/
        deno.json, main.ts, ...
    tentacle-snmp/
      0.0.3/                 # Go binary
        tentacle-snmp
  bin/
    tentacle-snmp -> ../versions/tentacle-snmp/0.0.4/tentacle-snmp
    deno                     # NOT managed by orchestrator
    nats-server              # NOT managed by orchestrator
  services/
    tentacle-mqtt -> ../versions/tentacle-mqtt/0.0.6/
    tentacle-web -> ../versions/tentacle-web/0.0.7/
Download URLs (per runtime)
| Runtime | Release Asset | Install Location | Symlink |
|---|---|---|---|
| go | {moduleId}-linux-{arch} | versions/{moduleId}/{ver}/{moduleId} | bin/{moduleId} |
| deno | {repo}-src.tar.gz | versions/{moduleId}/{ver}/ | services/{repo} |
| deno-web | {repo}-build.tar.gz | versions/{moduleId}/{ver}/ | services/{repo} |
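The table can be expressed as a small mapping function. This is a sketch: releaseLayout and ReleaseLayout are hypothetical names, paths are relative to the install directory, and the arch value (for example "amd64") is an assumption since the source does not list supported architectures.

```typescript
type ModuleRuntime = "go" | "deno" | "deno-web";
type ReleaseLayout = { asset: string; installPath: string; symlink: string };

/** Hypothetical helper mirroring the table rows; paths are relative to the install dir. */
function releaseLayout(
  runtime: ModuleRuntime,
  moduleId: string,
  repo: string,
  version: string,
  arch: string, // assumed values such as "amd64" or "arm64"
): ReleaseLayout {
  const versionDir = `versions/${moduleId}/${version}`;
  switch (runtime) {
    case "go":
      return {
        asset: `${moduleId}-linux-${arch}`,
        installPath: `${versionDir}/${moduleId}`,
        symlink: `bin/${moduleId}`,
      };
    case "deno":
      return {
        asset: `${repo}-src.tar.gz`,
        installPath: `${versionDir}/`,
        symlink: `services/${repo}`,
      };
    case "deno-web":
      return {
        asset: `${repo}-build.tar.gz`,
        installPath: `${versionDir}/`,
        symlink: `services/${repo}`,
      };
  }
}

console.log(releaseLayout("go", "tentacle-snmp", "tentacle-snmp", "0.0.4", "amd64"));
```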
Module Registry
Static registry in types/registry.ts mirrors install.sh's MODULES array. Each entry defines repo, moduleId, category, runtime, and optional extraEnv.
The orchestrator only manages modules listed in this registry. Unknown moduleIds in desired_services are logged and skipped.
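A registry lookup with the log-and-skip behavior might look like the sketch below. The entry shape follows the fields listed above, but EXAMPLE_REGISTRY, its two entries' category values, and partitionByRegistry are illustrative assumptions rather than the contents of types/registry.ts.

```typescript
type ModuleRuntime = "go" | "deno" | "deno-web";
type ModuleCategory = "core" | "optional";
type RegistryEntry = {
  repo: string;
  moduleId: string;
  category: ModuleCategory;
  runtime: ModuleRuntime;
  extraEnv?: Record<string, string>;
};

// Illustrative entries only; category values here are assumed, not taken from the source.
const EXAMPLE_REGISTRY: RegistryEntry[] = [
  { repo: "tentacle-mqtt", moduleId: "tentacle-mqtt", category: "optional", runtime: "deno" },
  { repo: "tentacle-snmp", moduleId: "tentacle-snmp", category: "optional", runtime: "go" },
];

/** Split desired_services keys into registry-backed modules and unknown ids to skip. */
function partitionByRegistry(
  moduleIds: string[],
  registry: RegistryEntry[],
): { managed: RegistryEntry[]; skipped: string[] } {
  const byId = new Map<string, RegistryEntry>();
  for (const entry of registry) byId.set(entry.moduleId, entry);

  const managed: RegistryEntry[] = [];
  const skipped: string[] = [];
  for (const id of moduleIds) {
    const entry = byId.get(id);
    if (entry === undefined) {
      skipped.push(id); // a real implementation would also log this
    } else {
      managed.push(entry);
    }
  }
  return { managed, skipped };
}

console.log(partitionByRegistry(["tentacle-mqtt", "not-a-module"], EXAMPLE_REGISTRY));
```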
Configuration
Environment variables (all optional — sensible defaults for production):
| Variable | Default | Description |
|---|---|---|
| NATS_SERVERS | localhost:4222 | NATS server address |
| TENTACLE_INSTALL_DIR | /opt/tentacle | Base install directory |
| TENTACLE_SYSTEMD_DIR | /etc/systemd/system | Systemd unit directory |
| TENTACLE_NATS_UNIT | tentacle-nats | NATS systemd unit name (without .service) |
| TENTACLE_GH_ORG | joyautomation | GitHub org for release downloads |
| TENTACLE_RECONCILE_INTERVAL | 30000 | Periodic sweep interval (ms) |
| TENTACLE_LATEST_CACHE_TTL | 300000 | Cache duration for "latest" version resolution (ms) |
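Loading these variables with their defaults could look roughly like this. The sketch takes the environment as a plain record so it stays runtime-agnostic; OrchestratorConfig, its field names, and loadConfig are hypothetical and may not match types/config.ts. Falling back to the default on non-numeric or non-positive values is also an assumption.

```typescript
// Hypothetical config shape; field names are illustrative.
type OrchestratorConfig = {
  natsServers: string;
  installDir: string;
  systemdDir: string;
  natsUnit: string;
  ghOrg: string;
  reconcileIntervalMs: number;
  latestCacheTtlMs: number;
};

function loadConfig(env: Record<string, string | undefined>): OrchestratorConfig {
  // Read a string variable, falling back when unset or empty.
  const str = (name: string, fallback: string): string => {
    const raw = env[name];
    return raw !== undefined && raw !== "" ? raw : fallback;
  };
  // Read a positive number, falling back on anything unparsable (assumed policy).
  const num = (name: string, fallback: number): number => {
    const parsed = Number(env[name]);
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
  };
  return {
    natsServers: str("NATS_SERVERS", "localhost:4222"),
    installDir: str("TENTACLE_INSTALL_DIR", "/opt/tentacle"),
    systemdDir: str("TENTACLE_SYSTEMD_DIR", "/etc/systemd/system"),
    natsUnit: str("TENTACLE_NATS_UNIT", "tentacle-nats"),
    ghOrg: str("TENTACLE_GH_ORG", "joyautomation"),
    reconcileIntervalMs: num("TENTACLE_RECONCILE_INTERVAL", 30000),
    latestCacheTtlMs: num("TENTACLE_LATEST_CACHE_TTL", 300000),
  };
}

// Dev-style override: NATS runs as nats.service, sweep every 5s.
console.log(loadConfig({ TENTACLE_NATS_UNIT: "nats", TENTACLE_RECONCILE_INTERVAL: "5000" }));
```

Under Deno, the record could be supplied with Deno.env.toObject().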
GraphQL API
The orchestrator's KV data is exposed via tentacle-graphql:
Queries:
- desiredServices: [DesiredService] (list all entries in desired_services KV)
- serviceStatuses: [ServiceStatus] (list all entries in service_status KV)
Mutations:
- setDesiredService(moduleId, version, running): DesiredService (create/update a desired service entry)
- deleteDesiredService(moduleId): Boolean (remove a desired service entry)
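A client could build the request body for setDesiredService as sketched below. Several details are assumptions: the argument scalar types (String!, Boolean!), the selected fields on DesiredService (taken to mirror DesiredServiceKV), and the helper name buildSetDesiredServiceRequest. The tentacle-graphql endpoint URL is not stated in this document, so the sketch stops at producing the JSON payload.

```typescript
type GraphQLRequest = {
  query: string;
  variables: { moduleId: string; version: string; running: boolean };
};

/** Hypothetical helper: builds a standard GraphQL-over-HTTP JSON body. */
function buildSetDesiredServiceRequest(
  moduleId: string,
  version: string,
  running: boolean,
): GraphQLRequest {
  // Argument types and selected fields are assumed, not confirmed by the schema.
  const query = `
    mutation SetDesired($moduleId: String!, $version: String!, $running: Boolean!) {
      setDesiredService(moduleId: $moduleId, version: $version, running: $running) {
        moduleId
        version
        running
      }
    }`;
  return { query, variables: { moduleId, version, running } };
}

// Roll tentacle-mqtt back to 0.0.5 and keep it running.
const body = JSON.stringify(buildSetDesiredServiceRequest("tentacle-mqtt", "0.0.5", true));
console.log(body);
```

Pinning an older version this way is how a rollback would be requested, since rollback is defined as updating the KV version.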
Key Files
| File | Purpose |
|---|---|
| main.ts | Entry point: NATS connect, heartbeat, start reconciler |
| types/config.ts | Env-based config with defaults |
| types/registry.ts | MODULE_REGISTRY constant |
| reconciler/reconciler.ts | Main loop (KV watch + periodic sweep) |
| reconciler/download.ts | GitHub release download, version resolution |
| reconciler/install.ts | Extract, install, symlink management |
| reconciler/systemd.ts | systemctl start/stop/status, unit file generation |
| reconciler/status.ts | Write ServiceStatusKV to KV |
| reconciler/migration.ts | Bootstrap from pre-orchestrator installs |
| nats/client.ts | NATS connection, KV handles |
Gotchas
- Self-update: The orchestrator cannot restart itself in-place. It writes a shell script that updates the symlink and runs systemctl restart tentacle-orchestrator, then executes it fire-and-forget.
- NATS unit name: In dev environments where NATS runs as nats.service instead of tentacle-nats.service, set TENTACLE_NATS_UNIT=nats.
- Bootstrap marker: After first migration, writes /opt/tentacle/config/.orchestrator-migrated to avoid re-migrating on restart.
- Rate limiting: GitHub API calls for "latest" resolution are cached for 5 minutes to avoid rate limits.
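The 5-minute caching of "latest" lookups can be sketched as a small TTL cache with an injectable clock. This is an illustrative sketch, not the code in reconciler/download.ts: createLatestCache and its get/set methods are hypothetical, and keying entries by repo name is an assumption.

```typescript
type LatestCache = {
  get(repo: string): string | null;
  set(repo: string, version: string): void;
};

/** Hypothetical TTL cache; `now` is injectable so expiry can be tested without waiting. */
function createLatestCache(ttlMs: number, now: () => number = Date.now): LatestCache {
  const entries = new Map<string, { version: string; storedAt: number }>();
  return {
    get(repo: string): string | null {
      const hit = entries.get(repo);
      if (hit === undefined) return null;
      if (now() - hit.storedAt >= ttlMs) {
        entries.delete(repo); // expired: caller should query GitHub again
        return null;
      }
      return hit.version;
    },
    set(repo: string, version: string): void {
      entries.set(repo, { version, storedAt: now() });
    },
  };
}

// Usage with a fake clock: the entry is served until the TTL elapses.
let clock = 0;
const cache = createLatestCache(300000, () => clock);
cache.set("tentacle-mqtt", "0.0.6");
clock = 299999;
console.log(cache.get("tentacle-mqtt")); // still cached
clock = 300000;
console.log(cache.get("tentacle-mqtt")); // expired, null
```

The caller would consult get first and only call the GitHub API, followed by set, on a null result.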