Spire
Platform operations API and terminal UI for self-hosted infrastructure stacks. Aggregates cluster state from Consul, Nomad, Vault, and Garage into a unified API, and presents it through a Ratatui TUI with live navigation and mutation support.
Architecture
    spire-tui (Ratatui, polls every 5s)
        │
        │ HTTP/TLS + API key
        ▼
  ┌───────────┐
  │ spire-api │  Axum · :5300 · API key auth · audit log
  │           │  background poller (15s)
  └─────┬─────┘
        │
   ┌────┼────┬─────────┐
   ▼    ▼    ▼         ▼
 Consul Nomad Vault  Garage
The background poller fetches state from all four upstreams every 15 seconds and caches it in memory. The TUI reads the cached snapshot on each poll cycle. Live proxy routes (detail views, mutations) bypass the cache and hit upstreams directly.
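The split between the cached read path and the background poller can be sketched roughly as follows. This is a minimal illustration using only the standard library, with a plain thread standing in for the Tokio task the real poller presumably uses. The `ClusterState` fields and the `poll_once`, `snapshot`, and `spawn_poller` functions are hypothetical names chosen for the example, not taken from the Spire source.

```rust
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;

/// Hypothetical, trimmed-down stand-in for the ClusterState type in state.rs.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ClusterState {
    pub consul_members: usize,
    pub nomad_jobs: usize,
    pub vault_sealed: bool,
    pub garage_healthy: bool,
    /// Incremented on every poll so readers can tell snapshots apart.
    pub generation: u64,
}

pub type SharedState = Arc<RwLock<ClusterState>>;

/// One poll cycle: fetch a fresh snapshot, then swap it in under a short write lock.
pub fn poll_once(shared: &SharedState, fetch: impl Fn() -> ClusterState) {
    // Fetch before taking the lock so readers never wait on upstream I/O.
    let mut fresh = fetch();
    let mut guard = shared.write().expect("state lock poisoned");
    fresh.generation = guard.generation + 1;
    *guard = fresh;
}

/// Cached read path: clone the snapshot and release the read lock immediately.
pub fn snapshot(shared: &SharedState) -> ClusterState {
    shared.read().expect("state lock poisoned").clone()
}

/// Background loop that refreshes the cache every `interval` (15s by default in Spire).
pub fn spawn_poller<F>(shared: SharedState, interval: Duration, fetch: F) -> thread::JoinHandle<()>
where
    F: Fn() -> ClusterState + Send + 'static,
{
    thread::spawn(move || loop {
        poll_once(&shared, &fetch);
        thread::sleep(interval);
    })
}
```

In this sketch a handler for `GET /api/v1/cluster` would call `snapshot`, while the live routes would skip the shared state entirely and query the upstream directly.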
Workspace Structure
spire/
├── api/                     spire-api binary (Axum)
│   ├── src/
│   │   ├── main.rs          Config → DB pool → poller → router → server
│   │   ├── config.rs        Config::from_env()
│   │   ├── auth.rs          API key validation (argon2id) + Oathkeeper upstream headers
│   │   ├── audit.rs         Audit log writes
│   │   ├── db.rs            SQLx pool setup
│   │   ├── health.rs        Cluster health scoring (Consul, Nomad, Vault, Patroni)
│   │   ├── state.rs         ClusterState type + shared Arc<RwLock<>>
│   │   ├── poller.rs        Background task: fetches all upstreams, writes state
│   │   ├── proxy/
│   │   │   ├── consul.rs    Consul members, services, health
│   │   │   ├── nomad.rs     Nomad jobs, nodes, allocations, stop/start/restart
│   │   │   ├── vault.rs     Vault seal status, HA leader
│   │   │   └── garage.rs    Garage admin status and layout
│   │   └── routes/          HTTP handlers (cluster, consul, nomad, vault, garage, gpu, auth, health)
│   └── migrations/          SQLx migrations (api_keys, audit_log)
└── tui/                     spire-tui binary (Ratatui)
    └── src/
        ├── main.rs          Config load → client init → event loop
        ├── config.rs        ~/.config/spire/config.toml
        ├── client.rs        HTTP client wrapper (spire-api calls)
        ├── app.rs           App state, keybindings, modal handling
        ├── modal.rs         Confirm, GpuConfirm, GpuIdleSelect modal variants
        └── tabs/
            ├── overview.rs  Cluster summary: all services at a glance
            ├── services.rs  Consul services table
            ├── jobs.rs      Nomad jobs table with mutable indicators
            ├── data.rs      Patroni topology + Garage storage layout
            └── secrets.rs   Vault seal status and HA node list
Tech Stack
- API framework: Axum 0.8 + Tokio
- Database: SQLx 0.8 + PostgreSQL (API keys, audit log)
- HTTP client: reqwest 0.13 + rustls
- TLS: rustls-pemfile + native-tls (optional)
- Auth: argon2id API keys + Oathkeeper upstream headers (X-User-Id/X-User-Scope)
- TUI: Ratatui + crossterm
- Config: TOML (TUI) + environment variables (API)
API Routes
GET /health Liveness
GET /health/ready Readiness (upstreams reachable)
GET /api/v1/cluster Full cached cluster state (JSON)
GET /api/v1/cluster/health Health summary (overall + per-subsystem)
GET /api/v1/consul/members Consul member list (live)
GET /api/v1/consul/services Consul service catalog (live)
GET /api/v1/consul/service/{name} Service health detail (live)
GET /api/v1/nomad/nodes Nomad node list (live)
GET /api/v1/nomad/jobs Nomad job list (live)
GET /api/v1/nomad/job/{id} Job detail (live)
GET /api/v1/nomad/job/{id}/allocations Job allocations (live)
POST /api/v1/nomad/job/{id}/stop Stop job
POST /api/v1/nomad/job/{id}/start Start job
POST /api/v1/nomad/job/{id}/restart Restart job (stop + start)
POST /api/v1/gpu/switch Toggle GPU mode (ai ↔ stream)
GET /api/v1/vault/status Vault seal status (live)
GET /api/v1/vault/leader Vault HA leader (live)
GET /api/v1/garage/status Garage cluster status (live)
GET /api/v1/garage/layout Garage storage layout (live)
GET /api/v1/auth/verify Verify current API key
POST /api/v1/auth/keys Create API key
GET /api/v1/auth/keys List API keys (own keys; admin sees all)
DELETE /api/v1/auth/keys/{id} Revoke API key
GPU Mode Switching
Spire tracks a binary GPU mode — ai (inference workloads) or stream (game streaming). Jobs opt in via Nomad meta tags:
- spire_gpu_group = "ai": belongs to the AI workload group
- spire_gpu_group = "stream": belongs to the streaming group
- spire_mutable = "true": can be individually stopped/started from the TUI
POST /api/v1/gpu/switch stops all jobs in the active group and starts all jobs in the target group in parallel (via Tokio JoinSet). The G key in the TUI triggers the switch with a confirmation modal.
Jobs with spire_gpu_group set display their group label (highlighted when active) in the Jobs tab indicator column.
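As a rough illustration of the grouping logic, the sketch below works out which jobs a switch would stop and which it would start, based only on the `spire_gpu_group` meta tag. The `Job`, `GpuMode`, `SwitchPlan`, and `plan_switch` names are hypothetical and not taken from the Spire source. The real handler also issues the Nomad calls in parallel via a Tokio JoinSet, which is omitted here.

```rust
use std::collections::HashMap;

/// The two GPU modes Spire tracks.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum GpuMode {
    Ai,
    Stream,
}

impl GpuMode {
    /// Value expected in the `spire_gpu_group` meta tag.
    pub fn tag(self) -> &'static str {
        match self {
            GpuMode::Ai => "ai",
            GpuMode::Stream => "stream",
        }
    }

    /// The mode a toggle moves to.
    pub fn other(self) -> GpuMode {
        match self {
            GpuMode::Ai => GpuMode::Stream,
            GpuMode::Stream => GpuMode::Ai,
        }
    }
}

/// Hypothetical minimal view of a Nomad job: its ID plus its meta block.
pub struct Job {
    pub id: String,
    pub meta: HashMap<String, String>,
}

/// Job IDs to stop and to start for one switch.
#[derive(Debug, PartialEq)]
pub struct SwitchPlan {
    pub stop: Vec<String>,
    pub start: Vec<String>,
}

/// Plans a toggle away from `active`; jobs without the tag are left untouched.
pub fn plan_switch(jobs: &[Job], active: GpuMode) -> SwitchPlan {
    let in_group = |mode: GpuMode| -> Vec<String> {
        jobs.iter()
            .filter(|j| j.meta.get("spire_gpu_group").map(String::as_str) == Some(mode.tag()))
            .map(|j| j.id.clone())
            .collect()
    };
    SwitchPlan {
        stop: in_group(active),
        start: in_group(active.other()),
    }
}
```

Keeping the planning step separate from the Nomad calls makes it easy to show the affected jobs in a confirmation modal before anything is stopped.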
Configuration
API (spire-api) — Environment Variables
| Variable | Required | Default |
|---|---|---|
| DATABASE_URL | yes | — |
| HOST | no | 0.0.0.0 |
| PORT | no | 5300 |
| TLS_CERT_PATH | no | — |
| TLS_KEY_PATH | no | — |
| CONSUL_ADDR | no | https://consul.service.consul:8501 |
| NOMAD_ADDR | no | https://nomad.service.consul:4646 |
| VAULT_ADDR | no | https://vault.service.consul:8200 |
| GARAGE_ADMIN_URL | no | — |
| CONSUL_HTTP_TOKEN | no | — |
| NOMAD_TOKEN | no | — |
| VAULT_TOKEN | no | — |
| GARAGE_ADMIN_TOKEN | no | — |
| CA_CERT_PATH | no | /etc/ssl/certs/platform-ca.crt |
| SPIRE_POLL_INTERVAL | no | 15 |
Service name defaults (patroni, garage) assume Consul service registrations match those names. Adjust your Consul service definitions or override via your platform's environment injection if your service names differ.
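To make the required/default behaviour in the table concrete, here is a minimal sketch of how a `Config::from_env()` could apply it. Only a subset of the variables is covered. The struct fields, the `from_lookup` helper, and the error strings are assumptions made for the example and do not come from the actual `config.rs`.

```rust
use std::env;

/// Hypothetical subset of the API configuration; field names are illustrative.
#[derive(Debug, PartialEq)]
pub struct Config {
    pub database_url: String,
    pub host: String,
    pub port: u16,
    pub consul_addr: String,
    pub garage_admin_url: Option<String>,
    pub poll_interval_secs: u64,
}

impl Config {
    /// Reads the process environment.
    pub fn from_env() -> Result<Config, String> {
        Config::from_lookup(|key| env::var(key).ok())
    }

    /// Same logic with an injectable lookup, so it can be exercised without
    /// mutating the real environment.
    pub fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Result<Config, String> {
        let or_default =
            |key: &str, default: &str| get(key).unwrap_or_else(|| default.to_string());
        Ok(Config {
            // The only required variable in the table.
            database_url: get("DATABASE_URL").ok_or("DATABASE_URL is required")?,
            host: or_default("HOST", "0.0.0.0"),
            port: or_default("PORT", "5300")
                .parse()
                .map_err(|e| format!("PORT: {e}"))?,
            consul_addr: or_default("CONSUL_ADDR", "https://consul.service.consul:8501"),
            // No default in the table, so absence is modelled as None.
            garage_admin_url: get("GARAGE_ADMIN_URL"),
            poll_interval_secs: or_default("SPIRE_POLL_INTERVAL", "15")
                .parse()
                .map_err(|e| format!("SPIRE_POLL_INTERVAL: {e}"))?,
        })
    }
}
```

With only `DATABASE_URL` set, this sketch yields port 5300, host 0.0.0.0, and a 15-second poll interval, matching the defaults listed above.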
TUI (spire-tui) — ~/.config/spire/config.toml
api_url = "https://spire.service.consul:5300"
api_key = "sk-spire-..."
poll_interval = 5 # seconds between API polls
ca_cert = "/path/to/ca.crt" # optional; required if API uses a private CA
Quick Start
# Build both binaries
cargo build --release
# Run database migrations
sqlx migrate run --source api/migrations
# Start the API
DATABASE_URL=postgres://... \
CONSUL_HTTP_TOKEN=... \
NOMAD_TOKEN=... \
VAULT_TOKEN=... \
./target/release/spire-api
# Configure and start the TUI
mkdir -p ~/.config/spire
cat > ~/.config/spire/config.toml <<EOF
api_url = "http://localhost:5300"
api_key = "sk-spire-..."
EOF
./target/release/spire-tui
TUI Keybindings
| Key | Context | Action |
|---|---|---|
| 1–5 | any | Switch tab |
| ↑ / k | list | Move up |
| ↓ / j | list | Move down |
| Enter | list | Open detail view |
| Esc | detail | Back to list |
| r | any | Force refresh |
| s | Jobs (mutable, running) | Stop job |
| g | Jobs (mutable, dead) | Start job |
| R | Jobs (mutable, running) | Restart job |
| G | Jobs / Overview | Toggle GPU mode |
| q | any | Quit |
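The context rules in the table can be read as a single dispatch function. The sketch below is one possible encoding, not the actual `app.rs` logic. The `Key`, `Tab`, `Ctx`, `SelectedJob`, and `Action` types are hypothetical (a real implementation would match on crossterm key events), and it assumes that job mutations apply only in the Jobs list view.

```rust
/// Hypothetical key type; stands in for the terminal library's key codes.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Key {
    Char(char),
    Up,
    Down,
    Enter,
    Esc,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Tab {
    Overview,
    Services,
    Jobs,
    Data,
    Secrets,
}

/// The highlighted job, if any: whether it is tagged mutable and whether it is running.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct SelectedJob {
    pub mutable: bool,
    pub running: bool,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Ctx {
    pub tab: Tab,
    pub in_detail: bool,
    pub selected_job: Option<SelectedJob>,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Action {
    SwitchTab(u8),
    MoveUp,
    MoveDown,
    OpenDetail,
    Back,
    Refresh,
    StopJob,
    StartJob,
    RestartJob,
    ToggleGpu,
    Quit,
}

/// Maps a key press to an action, or None when the key does not apply in this context.
pub fn action_for(key: Key, ctx: &Ctx) -> Option<Action> {
    use Key::*;
    let on_list = !ctx.in_detail;
    // Assumption: mutations need a selected, mutable job in the Jobs list.
    let job = match (ctx.tab, on_list, ctx.selected_job) {
        (Tab::Jobs, true, Some(j)) if j.mutable => Some(j),
        _ => None,
    };
    match key {
        Char(c @ '1'..='5') => Some(Action::SwitchTab(c as u8 - b'0')),
        Char('q') => Some(Action::Quit),
        Char('r') => Some(Action::Refresh),
        Up | Char('k') if on_list => Some(Action::MoveUp),
        Down | Char('j') if on_list => Some(Action::MoveDown),
        Enter if on_list => Some(Action::OpenDetail),
        Esc if ctx.in_detail => Some(Action::Back),
        Char('s') if job.map_or(false, |j| j.running) => Some(Action::StopJob),
        Char('R') if job.map_or(false, |j| j.running) => Some(Action::RestartJob),
        Char('g') if job.map_or(false, |j| !j.running) => Some(Action::StartJob),
        Char('G') if matches!(ctx.tab, Tab::Jobs | Tab::Overview) => Some(Action::ToggleGpu),
        _ => None,
    }
}
```

Returning `None` for keys that do not apply keeps the event loop simple: it can ignore the key press rather than special-case each tab.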
Deployment
deploy.sh builds both binaries, packages them with the migrations into a tarball, and uploads it to S3-compatible object storage via rclone. The Nomad job pulls and unpacks the artifact on allocation.
Edit the REMOTE and BUCKET variables at the top of deploy.sh to match your rclone remote and bucket name.
License
MIT