Deploying Alcoves

Alcoves ships as a single Docker image that runs the whole stack: the Go API, the async worker, and the Nuxt (Nitro) frontend. It relies on two external data stores (PostgreSQL and Dragonfly). This page covers everything an operator needs to run a production instance: the runtime topology, the Helm chart, environment variables, ingress configuration, and the most important operational gotchas.


A production Alcoves deployment runs two processes (the Nitro frontend and the Go API/worker) plus two external data stores. Both processes live inside the same image — the container’s entrypoint supervises them together:

           ┌───────────────────────────────────────────────────────────┐
browser ──▶│ reverse proxy / ingress                                   │
           │   /api/** ──▶ Go API :3001                                │
           │   / (everything else, incl. /s/**) ─▶ Nitro :3000         │
           └─────────────────────┬───────────────┬─────────────────────┘
                                 │               │
           ┌─────────────────────┴───────────────┴─────────────────────┐
           │ one container (ghcr.io/rustyguts/alcoves)                 │
           │  ┌────────────────────┐      ┌────────────────────────┐   │
           │  │ Nuxt 4 (Nitro)     │      │ Go API/worker (Echo)   │   │
           │  │ :3000              │      │ :3001                  │   │
           │  │ UI + SSR /s/**     │ ───▶ │ ALCOVES_MODE=          │   │
           │  │ + /api proxy       │      │ all | api | worker     │   │
           │  └────────────────────┘      └──────┬───────┬─────────┘   │
           └─────────────────────────────────────┼───────┼─────────────┘
                                                 │       │
                                ┌────────────────▼──┐  ┌─▼──────────────────┐
                                │ PostgreSQL 18     │  │ Dragonfly (Redis)  │
                                │ + pgvector        │  │ Asynq job queue +  │
                                │ :5432             │  │ activity pub/sub   │
                                └───────────────────┘  └────────────────────┘

| Process | Default port | Role |
| --- | --- | --- |
| Nuxt 4 (Nitro) | 3000 | Serves the UI, SSRs /s/** (public share pages), and proxies /api/** to the Go API. All other routes are client-rendered. |
| Go API (Echo) | 3001 | All /api/** HTTP endpoints and the async worker pool. |
| PostgreSQL 18 + pgvector | 5432 | System of record. pgvector is required from the first migration (512-dim face embeddings). |
| Dragonfly (Redis-compatible) | 6379 prod / 6389 dev | Backs the Asynq job queue and the cross-process activity pub/sub bus. |

Single-image quick start: docker run -p 3000:3000 ghcr.io/rustyguts/alcoves runs the whole stack. Nitro serves the UI on :3000 and proxies /api/** to the co-located Go API on 127.0.0.1:3001, so a single published port is enough to get going.

Routing contract (production): front the container with one reverse proxy. Route /api/** to the Go API on :3001 and everything else (including the SSR share pages at /s/**) to the Nitro server on :3000. Routing /api/** straight to :3001 — rather than letting Nitro proxy it — is what keeps video Range streaming intact (see Direct browser streaming). Publishing :3001 from the same container makes this possible without a second image.
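
The routing contract can be sketched as a small path-matching function. This is only an illustration of the rule stated above, not the proxy's actual configuration: the function name and the 127.0.0.1 upstream addresses are assumptions, and so is the choice to match on path-segment boundaries (so /apiary is not treated as an API route).

```python
def pick_upstream(path: str) -> str:
    """Return the upstream (host:port) that should serve a request path."""
    # /api itself and anything under /api/ goes straight to the Go API.
    if path == "/api" or path.startswith("/api/"):
        return "127.0.0.1:3001"
    # Everything else, including the SSR share pages under /s/, goes to Nitro.
    return "127.0.0.1:3000"


for path in ("/api/version", "/s/abc123", "/", "/apiary"):
    print(f"{path} -> {pick_upstream(path)}")
```

Any reverse proxy that can express "prefix /api to one backend, catch-all to another" satisfies the same contract.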

The Go API and worker ship as a single binary. Set ALCOVES_MODE to select behavior:

| Mode | What runs |
| --- | --- |
| all (default) | HTTP routes and the Asynq worker pool. Simplest option for small instances. |
| api | HTTP routes only, with no worker goroutine. Scale the web tier independently. |
| worker | Asynq worker only. The full HTTP API is not registered; only the health and version probes remain. The activity WebSocket hub is not created. |

Splitting into separate api and worker deployments lets you give the CPU/RAM-hungry ML workloads their own resources without affecting request latency.
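
As a compact restatement of the mode table, the sketch below maps each ALCOVES_MODE value to the components it starts. The Components type and function name are hypothetical, and rejecting unknown values with an error is an assumption; the real binary may handle them differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Components:
    http_api: bool     # full /api/** route registration
    worker_pool: bool  # Asynq worker pool


# One entry per documented mode; health and version probes exist in all of them.
MODES = {
    "all": Components(http_api=True, worker_pool=True),
    "api": Components(http_api=True, worker_pool=False),
    "worker": Components(http_api=False, worker_pool=True),
}


def components_for(mode: str = "") -> Components:
    """Resolve an ALCOVES_MODE value; an empty value means the default, "all"."""
    key = mode or "all"
    if key not in MODES:
        # Assumption: unknown modes are treated as a configuration error.
        raise ValueError(f"unknown ALCOVES_MODE: {key!r}")
    return MODES[key]


print(components_for("worker"))
```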

Database migrations run automatically at API startup before any handler is registered. Deploying a new image version applies all pending schema changes without a separate migration step. Trigger a rolling upgrade with kubectl rollout restart deploy/<name>-api.


A single unified image is published to GitHub Container Registry on every tagged release:

| Image | Source | Purpose |
| --- | --- | --- |
| ghcr.io/rustyguts/alcoves:<version> | root Dockerfile | The whole stack: Go API/worker (libvips, ffmpeg, ONNX Runtime, whisper.cpp) and the Nuxt (Nitro) frontend, plus the Bun runtime that serves it |

Tags follow semver: 0.x.y, 0.x, and latest (from main).

The container’s entrypoint takes a role argument (the image CMD, overridable via docker run … <role> or Kubernetes args):

| Role | What runs |
| --- | --- |
| all (default) | Nitro (:3000) and the Go API+worker (:3001), supervised together. The simplest way to run everything in one container. |
| web | Only the Nitro server (UI + SSR share pages). |
| api | Only the Go HTTP API (ALCOVES_MODE=api). |
| worker | Only the Go Asynq worker (ALCOVES_MODE=worker). |

The single-role modes exist so the same image can back split deployments — see the Helm chart, which runs web, api, and worker as three separate workloads from this one image.

The image bundles all runtime dependencies for CPU-only ML inference plus the frontend:

  • libvips — image transforms, thumbnails, and proxy resizing
  • ffmpeg — video transcoding, thumbnail extraction, and audio waveform generation
  • ONNX Runtime v1.26.0 — face detection/recognition and COCO object detection
  • whisper.cpp — speech-to-text transcription (AVX/AVX2/FMA baseline; no AVX-512 requirement)
  • Bun + the Nuxt .output bundle — serves the UI and SSRs the public share pages

The running backend exposes its embedded version at GET /api/version:

{"version": "0.5.2", "commit": "abc1234", "buildTime": "", "mode": "all"}

A locally-built binary returns "version": "dev" because ldflags are not injected during go run.
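
A deployment check might read this response to confirm which build is running. The sketch below parses a response body shaped like the example above; the function name and the one-line summary format are illustrative only.

```python
import json


def describe_build(body: str) -> str:
    """Summarize a /api/version response body in one line."""
    info = json.loads(body)
    version = info.get("version", "unknown")
    # Per the text above, "dev" marks a binary built without release ldflags.
    kind = "local build" if version == "dev" else "release"
    commit = info.get("commit") or "unknown"
    mode = info.get("mode") or "unknown"
    return f"{version} ({kind}), commit {commit}, mode {mode}"


sample = '{"version": "0.5.2", "commit": "abc1234", "buildTime": "", "mode": "all"}'
print(describe_build(sample))
```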


Copy .env.example from the repo and fill in your values. Backend variables are prefixed ALCOVES_.

| Variable | Description |
| --- | --- |
| ALCOVES_SESSION_SECRET | AES-GCM key for encrypted session cookies. Must be ≥ 32 bytes. The API refuses to start without this. Generate with openssl rand -base64 32. |
| ALCOVES_DATABASE_URL | PostgreSQL connection string. Must point to a pgvector-enabled database. |
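
Since the API refuses to start without these two variables, a pre-flight check can catch a bad configuration before a rollout. The following is a minimal sketch, not part of Alcoves: the function name is hypothetical, and it assumes the 32-byte minimum applies to the raw value as UTF-8 bytes.

```python
from typing import List, Mapping


def check_required_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems with the required variables (empty if none)."""
    problems: List[str] = []
    secret = env.get("ALCOVES_SESSION_SECRET", "")
    # Assumption: the minimum length is measured on the raw value's bytes.
    if len(secret.encode("utf-8")) < 32:
        problems.append("ALCOVES_SESSION_SECRET must be at least 32 bytes")
    if not env.get("ALCOVES_DATABASE_URL", "").strip():
        problems.append("ALCOVES_DATABASE_URL is not set")
    return problems


# Typical use: pass os.environ and abort the deploy if anything is reported.
print(check_required_env({"ALCOVES_SESSION_SECRET": "too-short"}))
```

A value produced by openssl rand -base64 32 is 44 characters long, so it passes this check under either reading of the length requirement.
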

| Variable | Default | Description |
| --- | --- | --- |
| ALCOVES_MODE | all | all, api, or worker |
| ALCOVES_ENV | development | development or production. Controls CORS: localhost origins are only allowed in development. |
| ALCOVES_BASE_URL | http://localhost:3000 | Public-facing URL. Drives OAuth redirect URIs, share links, and the primary CORS origin. Keep this accurate in production. |
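
| Variable | Default | Description |
| --- | --- | --- |
| ALCOVES_QUEUE_HOST | localhost | Dragonfly/Redis host |
| ALCOVES_QUEUE_PORT | 6389 | Queue port. The default matches the development Dragonfly port; most Redis setups, including production, use 6379. |
| ALCOVES_QUEUE_PASSWORD | (empty) | Optional queue password |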

| Variable | Default | Description |
| --- | --- | --- |
| ALCOVES_STORAGE_DRIVER | local | local or s3 |
| ALCOVES_STORAGE_PATH | ./data | Root path for uploaded files (local driver) |
| ALCOVES_AVATAR_STORAGE_PATH | ./data/avatars | Override path for user avatars |
| ALCOVES_CACHE_STORAGE_PATH | ./data/.cache | Override path for derived/cached media |

For S3-compatible storage, set ALCOVES_STORAGE_DRIVER=s3 and provide:

| Variable | Description |
| --- | --- |
| ALCOVES_S3_BUCKET | Bucket name |
| ALCOVES_S3_REGION | Region |
| ALCOVES_S3_ENDPOINT | Custom endpoint URL (for S3-compatible providers) |
| ALCOVES_S3_ACCESS_KEY_ID | Access key ID |
| ALCOVES_S3_SECRET_ACCESS_KEY | Secret access key |
| ALCOVES_S3_FORCE_PATH_STYLE | Set to true for MinIO and similar |
| ALCOVES_S3_FILES_PREFIX / _AVATARS_PREFIX / _CACHE_PREFIX | Optional key prefixes per scope |

| Variable | Description |
| --- | --- |
| ALCOVES_OAUTH_GOOGLE_CLIENT_ID | Google OAuth client ID |
| ALCOVES_OAUTH_GOOGLE_CLIENT_SECRET | Google OAuth client secret |

The frontend probes GET /api/auth/providers at runtime, so the Google sign-in button appears automatically once these are set.

Model settings are boot-time fallbacks. Admins can override the whisper model, language, and audio-tagger selection at runtime from the admin panel (persisted in the database). Workers read admin settings first and fall back to env vars on a fresh install.

| Variable | Default | Description |
| --- | --- | --- |
| ALCOVES_MODELS_PATH | ./data/.models | ONNX model cache directory |
| ALCOVES_WHISPER_MODEL | large-v3 | Default whisper model; overridable in admin |
| ALCOVES_WHISPER_LANGUAGE | auto | Default transcription language |
| ALCOVES_WHISPER_VAD_MODEL | silero-v6.2.0 | Voice activity detection model; empty disables it |
| ALCOVES_WHISPER_MODEL_BASE_URL | https://s3.rustyguts.net/models | Where to download GGML whisper weights |
| ALCOVES_WHISPER_MODELS_DIR | ./data/.whisper | Whisper model cache directory |
| ALCOVES_AUDIO_DETECT_MODEL_BASE_URL | https://s3.rustyguts.net/models | Where to download audio-tagger ONNX models |

Fine-tuning thresholds (sensible defaults, adjust if needed):

| Variable | Default | Description |
| --- | --- | --- |
| ALCOVES_FACE_DETECTION_MIN_SCORE | | Minimum confidence for a detected face |
| ALCOVES_FACE_RECOGNITION_MAX_DISTANCE | | Embedding distance threshold for clustering faces into people |
| ALCOVES_OBJECT_DETECTION_MIN_SCORE | | Minimum confidence for a detected object |
| ALCOVES_OBJECT_DETECTION_MAX_DETECTIONS | | Cap on detected objects per image |
| ALCOVES_AUDIO_DETECT_WINDOW_SEC | 10.0 | Audio analysis window size |
| ALCOVES_AUDIO_DETECT_THRESHOLD | 0.2 | Minimum confidence for an audio event tag |
| ALCOVES_AUDIO_DETECT_TOP_K | 5 | Maximum audio tags per window |

These are set on the Nuxt server, not the Go API:

| Variable | Default | Description |
| --- | --- | --- |
| ALCOVES_API_URL | http://localhost:3001 | Go backend URL for the Nitro dev proxy and SSR fetches. Set to the in-cluster API service address in production. |
| NUXT_PUBLIC_API_ORIGIN | (empty) | Important in production. When set, browsers fetch binary content (video, images, downloads) directly from this origin instead of through Nitro. Nitro can corrupt HTTP Range responses, so setting this avoids seekable-video and download issues. |
| NITRO_HOST / NITRO_PORT | 0.0.0.0:3000 | Override the Nitro server bind address |

The Helm chart (helm/alcoves/) deploys Alcoves to Kubernetes. It does not deploy PostgreSQL or Dragonfly — operators supply their own.

  • Kubernetes 1.27+
  • An nginx ingress controller
  • cert-manager or another TLS source
  • pgvector-enabled PostgreSQL
  • A Redis-compatible queue (Dragonfly recommended)
  • Either an RWX-capable PersistentVolumeClaim or S3-compatible storage

The chart deploys three separate workloads, all from the one unified image. Each selects its role via container args:

| Workload | Role (args) | Default replicas | Purpose |
| --- | --- | --- | --- |
| backend-api | api | 2 | Serves all HTTP requests |
| backend-worker | worker | 1 | Runs Asynq jobs (ML inference, ffmpeg, transcription) |
| frontend | web | 2 | Nuxt Nitro server |

All three pull the same image and differ only by role — so rolling out a new version is a single image.tag bump for the entire app.

The worker deployment has no CPU limit and a generous memory limit (default 10Gi), and both choices are deliberate. Whisper large-v3 needs ~3.9 GB of RAM, and concurrent ffmpeg + ONNX inference can spike well past 4 GB. For ML workloads, CFS CPU throttling costs more in latency than it gains in isolation.

| Workload | CPU request | CPU limit | Memory request | Memory limit |
| --- | --- | --- | --- | --- |
| backend-api | 200m | 2 | 512Mi | 2Gi |
| backend-worker | 2 | (none) | 4Gi | 10Gi |
| frontend | 100m | 1 | 256Mi | 512Mi |

The chart’s ingress routes by path prefix:

  • /api → backend-api service
  • / → frontend service (catch-all, including SSR share pages at /s/**)

Default ingress annotations enable TUS resumable uploads and seekable video out of the box:

nginx.ingress.kubernetes.io/proxy-body-size: "0" # unlimited — required for TUS uploads
nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-buffering: "off"
nginx.ingress.kubernetes.io/proxy-request-buffering: "off"

TLS is enabled by default (ingress.tls.enabled: true), with a default secretName of alcoves-tls. Wire this to cert-manager or your own certificate provider.

For the local storage driver, the chart creates a PersistentVolumeClaim that both the API and worker pods mount at /app/data:

persistentVolume:
  enabled: true
  size: 200Gi
  storageClass: "" # use cluster default
  accessModes:
    - ReadWriteMany # required when API or worker replicaCount > 1

Switch to S3 by setting storage.driver: s3 and providing your bucket credentials in values.yaml or via an existing secret. No PVC is created in S3 mode.

The chart generates a single Kubernetes Secret (<release>-app) for credentials. Every secret supports delegation to a pre-existing Secret so you can use an external secret manager:

| Credential | values.yaml key | Existing secret key |
| --- | --- | --- |
| Session secret | sessionSecret | existingSessionSecret |
| Database URL | database.url | database.existingSecret |
| Queue password | queue.password | queue.existingSecret |
| Google OAuth | oauth.google.clientId / clientSecret | oauth.google.existingSecret |
| S3 credentials | storage.s3.accessKeyId / secretAccessKey | storage.s3.existingSecret |

sessionSecret is required — the chart fails if both sessionSecret and existingSessionSecret are empty.

Use extraEnv in values.yaml to pass additional environment variables (such as ML tuning thresholds) without forking the chart:

extraEnv:
  - name: ALCOVES_FACE_DETECTION_MIN_SCORE
    value: "0.85"
  - name: ALCOVES_AUDIO_DETECT_THRESHOLD
    value: "0.25"

Rolling out a new version restarts both the API and worker deployments. The API pod applies pending migrations before serving traffic. A safe upgrade sequence:

helm upgrade alcoves helm/alcoves/ --set image.tag=0.x.y
# or with kubectl after updating values.yaml:
kubectl rollout restart deploy/alcoves-api deploy/alcoves-worker

The liveness and readiness probes both hit the same endpoint, which is always registered regardless of ALCOVES_MODE:

GET /api/health
→ {"status": "ok", "mode": "api"}

Liveness probe: initialDelaySeconds: 20, periodSeconds: 30
Readiness probe: initialDelaySeconds: 5, periodSeconds: 5
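
Outside Kubernetes, for example in a smoke test after docker run, the same endpoint can be polled by hand. The sketch below is one way to do that, assuming the response shape shown above. The function names are hypothetical, and the caller supplies the fetch function so the polling logic stays independent of any HTTP client.

```python
import json
import time
from typing import Callable


def is_healthy(body: str) -> bool:
    """True when a /api/health response body reports status "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object: treat as unhealthy.
        return False


def wait_until_ready(fetch: Callable[[], str], attempts: int = 12, delay: float = 5.0) -> bool:
    """Call fetch() until it returns a healthy body or attempts run out."""
    for attempt in range(attempts):
        try:
            if is_healthy(fetch()):
                return True
        except OSError:
            pass  # connection refused or timed out: the API is not up yet
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


print(is_healthy('{"status": "ok", "mode": "api"}'))
```

A fetch function built on urllib.request.urlopen fits here, since the URLError it raises is a subclass of OSError.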

The version endpoint is also always available:

GET /api/version
→ {"version": "0.5.2", "commit": "abc1234", "buildTime": "…", "mode": "api"}

Alcoves uses an explicit CORS allowlist, not wildcard origins. ALCOVES_BASE_URL is the primary allowed origin. Additional origins can be added with ALCOVES_EXTRA_CORS_ORIGINS (comma-separated). Localhost variants are added automatically in development mode only.

Keep ALCOVES_BASE_URL accurate. The AllowCredentials: true CORS setting is safe only because the origin list is never reflected dynamically.
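
The allowlist behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the Go implementation: the function names are hypothetical, the two localhost origins stand in for whatever "localhost variants" the API actually adds, and trailing-slash normalization is a choice made for this example.

```python
from typing import List, Optional


def allowed_origins(base_url: str, extra: str = "", env: str = "development") -> List[str]:
    """Build the CORS allowlist from ALCOVES_BASE_URL, ALCOVES_EXTRA_CORS_ORIGINS, ALCOVES_ENV."""
    origins = [base_url.rstrip("/")]
    # ALCOVES_EXTRA_CORS_ORIGINS is comma-separated; ignore empty entries.
    origins += [item.strip().rstrip("/") for item in extra.split(",") if item.strip()]
    if env == "development":
        # Hypothetical localhost variants; the real list may differ.
        origins += ["http://localhost:3000", "http://127.0.0.1:3000"]
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(origins))


def allow_origin_header(request_origin: str, allowlist: List[str]) -> Optional[str]:
    """Return the Access-Control-Allow-Origin value, or None to send no header."""
    # Only exact matches are allowed; unknown origins are never echoed back.
    return request_origin if request_origin in allowlist else None


allow = allowed_origins("https://photos.example.com", "https://a.example.com", env="production")
print(allow_origin_header("https://evil.example", allow))
```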


ONNX models for face detection and object detection are pre-fetched in a background goroutine at startup. A worker that has not finished downloading will block the first job that needs that model. Whisper and audio-tagger models download lazily on first use. Plan for the first few jobs after a fresh deployment to run slower than steady-state.

Set NUXT_PUBLIC_API_ORIGIN so browsers fetch video and large files directly from the API instead of routing through the Nuxt server. Without it, Nitro acts as a passthrough for binary responses — and Nitro can corrupt HTTP Range responses, breaking video seeking and download resumption.
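
One way to confirm that Range streaming survives a given route is to request a small byte range and inspect the reply. The helper below checks the status code and Content-Range header of such a reply against the HTTP semantics for partial content. The function name is hypothetical, and how the request is sent is left to the caller.

```python
import re
from typing import Optional

# Matches "bytes <first>-<last>/<complete-length or *>".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def range_response_ok(status: int, content_range: Optional[str], start: int, end: int) -> bool:
    """True if a reply to "Range: bytes=start-end" looks like a valid partial response."""
    # A server that honours the range answers 206; a plain 200 means it was ignored.
    if status != 206 or not content_range:
        return False
    match = _CONTENT_RANGE.match(content_range.strip())
    if not match:
        return False
    first, last = int(match.group(1)), int(match.group(2))
    # The reply must start where asked; it may end earlier if the file is short.
    return first == start and first <= last <= end


print(range_response_ok(206, "bytes 0-1023/146515", 0, 1023))
```

Running the same check against the Nitro origin and the API origin shows whether a proxy hop is altering partial responses.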

Shared storage is mandatory for multi-replica

The API and worker pods both read and write /app/data (or the equivalent S3 prefix). With more than one replica for either, shared storage is not optional — use an RWX PVC or S3.
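
The rule above can be expressed as a small values check, for instance in a CI step that lints values.yaml before an upgrade. This is a hypothetical helper, not something the chart provides, and it encodes only the condition stated in this page.

```python
from typing import List, Optional


def storage_problem(driver: str, access_modes: List[str],
                    api_replicas: int, worker_replicas: int) -> Optional[str]:
    """Return a description of the storage misconfiguration, or None if it looks fine."""
    if driver == "s3":
        return None  # object storage is shared by every pod; no PVC involved
    if api_replicas > 1 or worker_replicas > 1:
        if "ReadWriteMany" not in access_modes:
            return "local storage with more than one replica needs a ReadWriteMany volume"
    return None


# The chart defaults (2 API replicas, 1 worker) already require RWX with the local driver.
print(storage_problem("local", ["ReadWriteOnce"], api_replicas=2, worker_replicas=1))
```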

The Helm default of 10Gi memory for the worker reflects real production peaks. If you reduce this, expect the worker to OOM-kill during large transcription or concurrent ffmpeg + ONNX jobs.


Alcoves is alpha (0.x.y). Releases are cut automatically by release-please whenever Conventional Commit PRs land on main:

  • feat: → minor bump (0.x.0)
  • fix: / perf: / refactor: / docs: → patch bump (0.x.y+1)
  • chore: / style: → hidden; no release PR

Images are built and pushed to GHCR on every tagged release with both the full semver tag (0.x.y) and the minor alias (0.x). Pin to a full semver tag in production to avoid unexpected updates.
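
The bump rules and image tags above can be worked through in a few lines. This sketch covers only the commit types listed on this page and is not release-please's own logic; the function names are hypothetical, and raising an error for unlisted types is an assumption.

```python
from typing import List, Optional

PATCH_TYPES = {"fix", "perf", "refactor", "docs"}
HIDDEN_TYPES = {"chore", "style"}


def next_version(current: str, commit_type: str) -> Optional[str]:
    """Return the version a commit of this type would release, or None if hidden."""
    major, minor, patch = (int(part) for part in current.split("."))
    if commit_type == "feat":
        return f"{major}.{minor + 1}.0"
    if commit_type in PATCH_TYPES:
        return f"{major}.{minor}.{patch + 1}"
    if commit_type in HIDDEN_TYPES:
        return None
    # Assumption: types not listed on this page are out of scope for the sketch.
    raise ValueError(f"commit type not covered here: {commit_type!r}")


def image_tags(version: str) -> List[str]:
    """Tags pushed for a release: the full version and its minor alias."""
    major, minor, _patch = version.split(".")
    return [version, f"{major}.{minor}"]


print(next_version("0.5.2", "feat"), image_tags("0.5.3"))
```

Because the minor alias moves with every patch release, only the full tag is stable enough to pin.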