
Upgrade helmdeck

Operator-facing upgrade procedure. Today the supported route is the Compose-stack path (git pull && make install); the Kubernetes/Helm path is previewed here and ships fully with v1.0 (Phase 7).

For client-side upgrades (OpenClaw, Claude Code, Gemini CLI, etc.) see integrations/openclaw-upgrade-runbook.md and the per-client integration docs.

1. Pre-upgrade checklist

Run through this checklist before starting any upgrade. It takes about 5 minutes.

Record the current version:

# From your helmdeck checkout:
git describe --tags --always
# → v0.9.0 (or v0.9.0-39-gc76f707 if you're past a tag)

# Or via the running container:
docker inspect helmdeck-control-plane --format '{{.Config.Image}}'
# → ghcr.io/tosin2013/helmdeck:dev (build-time tag)

# Confirm the running binary's commit (logged once at startup):
docker logs helmdeck-control-plane 2>&1 | grep 'helmdeck control-plane starting' | tail -1
# → "version":"dev","commit":"unknown" (when built locally without -ldflags)
# → "version":"v0.10.0","commit":"abcd123" (when built from a tag via the Makefile)
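To compare the running version against the tag you expect, the startup log line can be parsed with a small helper. This is a minimal sketch that assumes the log line carries fields in the `"key":"value"` form shown above; `log_field` is a hypothetical helper name, not part of helmdeck.

```shell
# log_field NAME: read log lines on stdin and print the value of the
# "NAME":"value" pair from the last matching line.
# Assumes values are plain strings without embedded double quotes.
log_field() {
  sed -n 's/.*"'"$1"'":"\([^"]*\)".*/\1/p' | tail -n 1
}

# Usage (requires the running container):
#   docker logs helmdeck-control-plane 2>&1 \
#     | grep 'helmdeck control-plane starting' | log_field version
```

Recording both `version` and `commit` this way gives you a value to compare against after the upgrade.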

Back up the SQLite database. helmdeck.db carries vault credentials, audit logs, and the keystore, so losing it means data loss for the operator:

# A plain cp of a live SQLite file can catch a write in progress; take the copy while helmdeck is idle.
cp /var/lib/helmdeck/helmdeck.db /var/lib/helmdeck/helmdeck.db.bak-$(date +%Y%m%d-%H%M%S)
# OR if running via Compose with the `helmdeck-data` volume:
docker run --rm -v helmdeck-data:/data -v /tmp:/backup alpine \
sh -c 'cp /data/helmdeck.db /backup/helmdeck.db.bak-$(date +%Y%m%d-%H%M%S)'
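A backup is only useful if it is actually a database file. The sketch below is a cheap sanity check that relies on the documented SQLite file header (`SQLite format 3` followed by a NUL byte); `is_sqlite_file` is a hypothetical helper name. It does not prove the file is internally consistent; where the `sqlite3` CLI is available, `PRAGMA integrity_check` is the stronger test.

```shell
# is_sqlite_file PATH: succeed if PATH is non-empty and starts with the
# SQLite header string. Command substitution drops the trailing NUL byte,
# so only the first 15 bytes are compared.
is_sqlite_file() {
  [ -s "$1" ] || return 1
  header=$(dd if="$1" bs=1 count=15 2>/dev/null)
  [ "$header" = "SQLite format 3" ]
}

# Usage:
#   is_sqlite_file /var/lib/helmdeck/helmdeck.db.bak-<timestamp> \
#     || echo "backup does not look like a SQLite database" >&2
```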

Snapshot the vault credential names (NOT the contents — secrets stay in vault):

JWT=$(./bin/control-plane -mint-token admin -mint-token-scopes admin)
curl -fsS http://localhost:3000/api/v1/vault/credentials \
-H "Authorization: Bearer $JWT" \
| python3 -c 'import sys,json; print("\n".join(c["name"] for c in json.load(sys.stdin)))' \
> /tmp/helmdeck-creds-pre-upgrade.txt
wc -l /tmp/helmdeck-creds-pre-upgrade.txt
# Compare post-upgrade to confirm count matches.

Read the new version's CHANGELOG entry. Specifically scan for any ### Breaking sub-sections:

git fetch --tags
git log --oneline v0.9.0..v0.10.0 # or whichever tag you're moving to
git show v0.10.0:CHANGELOG.md | head -80

Breaking changes usually mean one of three things: a pack input schema changed, a vault credential name changed, or a runtime requirement was bumped. For non-breaking minor releases (most of them), the in-place procedure in §2 is all you need.
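The scan for a `### Breaking` sub-section can be automated. The sketch below assumes release headings in CHANGELOG.md look like `## v0.10.0` or `## [v0.10.0] - <date>`; that heading shape is an assumption, and `breaking_in_release` is a hypothetical helper name.

```shell
# breaking_in_release VERSION: read a changelog on stdin and succeed only
# if the section for VERSION contains a "### Breaking" sub-heading.
breaking_in_release() {
  awk -v ver="$1" '
    /^## / {
      h = $2
      sub(/^\[/, "", h)   # tolerate "## [v0.10.0]" style headings
      sub(/\]$/, "", h)
      in_section = (h == ver)
    }
    in_section && /^### Breaking/ { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Usage:
#   git show v0.10.0:CHANGELOG.md | breaking_in_release v0.10.0 \
#     && echo "Breaking changes listed: read them before upgrading"
```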


2. In-place Compose-stack upgrade

This is the supported path on a single-host Compose deployment. Idempotent — re-running is safe.

cd /path/to/helmdeck
git fetch --tags
git checkout v0.10.0 # or whatever tag you're moving to
make sidecars # rebuilds helmdeck-sidecar:dev with any new tools (ffprobe, ctags, …)
make install # idempotent: re-runs preflight, rebuilds control-plane image, recreates the container

What make install does (post-checkout):

  1. Re-runs scripts/install.sh preflight — verifies Docker, Go, sufficient memory, exposed ports
  2. Rebuilds ghcr.io/tosin2013/helmdeck:dev (the control-plane image)
  3. Recreates the helmdeck-control-plane container — data volumes (helmdeck-data, helmdeck-artifacts-garage) persist
  4. Starts dependent services (Garage, Firecrawl/Docling overlays if previously enabled)

Time: ~3 minutes on a warm Docker layer cache; ~8 minutes on a cold rebuild.

Brief downtime: ~30 seconds while the control-plane container restarts. In-flight pack calls error out — operators with mid-flight workflows should wait for them to complete before running make install.
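Rather than guessing when the restart window has passed, a script can poll until the control plane answers. This is a minimal sketch of a generic retry loop; `retry` is a hypothetical helper, and the attempt count and delay in the usage line are illustrative values rather than documented ones.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between tries. Output of CMD is discarded.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage (up to ~60 seconds of polling):
#   retry 30 2 curl -fsS http://localhost:3000/healthz \
#     || echo "control plane did not come back" >&2
```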

Sidecar refresh: The make sidecars step rebuilds helmdeck-sidecar:dev. Sessions started before the upgrade keep their old sidecar image; only sessions created after the upgrade pick up the new one. Either drop existing sessions (UI: Sessions panel → terminate) or wait for the 5-min watchdog to expire them.

After make install returns successfully, run §5 (Post-upgrade validation) before declaring done.

After every release: re-stamp the OpenClaw skill

If you have an OpenClaw client wired in, re-run the configure script so the new SKILLS.md stamps into the OpenClaw container:

./scripts/configure-openclaw.sh

See docs/RELEASES.md §"Agent sync checklist — every release" for the full per-release checklist, which covers both the operator side and the agent side.


3. Schema migrations

helmdeck embeds its SQL migrations into the binary (internal/store/migrations/*.sql) and applies any not-yet-applied migrations automatically on every startup via store.Open. The procedure is:

  1. Open the database file
  2. Read the schema_migrations table to determine the highest-applied version
  3. Apply any newer files in version order, each in a transaction
  4. Record the new version in schema_migrations
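The selection in steps 2 and 3 can be illustrated in isolation. The sketch below only shows how "newer than the highest-applied version, in version order" might be computed; it is not helmdeck's implementation (that lives in store.Open), and the `<zero-padded version>_<name>.sql` file naming is an assumption made for the example.

```shell
# pending_migrations HIGHEST FILE...: print the files whose numeric version
# prefix is greater than HIGHEST, sorted by name.
pending_migrations() {
  highest="$1"
  shift
  for f in "$@"; do
    base=${f##*/}     # drop any directory prefix
    ver=${base%%_*}   # text before the first underscore
    ver=$(printf '%s' "$ver" | sed 's/^0*//')   # strip zero padding
    if [ "${ver:-0}" -gt "$highest" ]; then
      printf '%s\n' "$f"
    fi
  done | sort
}

# Usage (hypothetical file names):
#   pending_migrations 2 0001_init.sql 0002_vault.sql 0003_audit.sql
```

With zero-padded prefixes, lexical sort order matches version order, which is why a plain `sort` is enough here.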

You don't run migrate up manually — it happens at boot. If a migration fails, helmdeck-control-plane refuses to start and logs the SQL error; revert to the prior version (§6 Rollback) and file an issue.

To verify migrations applied post-upgrade:

docker exec helmdeck-control-plane sqlite3 /data/helmdeck.db \
'SELECT version, applied_at FROM schema_migrations ORDER BY version'

You should see one row per migration in internal/store/migrations/. If a version is missing and the binary started cleanly, that migration was a no-op (e.g. an additive CREATE TABLE IF NOT EXISTS), which is not a problem.

Note: This auto-apply behavior makes upgrades safe but rollbacks tricky — see §6.


4. Kubernetes / Helm path (preview, GA in v1.0)

⚠️ Coming with v1.0 (Phase 7). The full Helm chart, KEDA scaling, NetworkPolicy isolation, External Secrets integration, and OpenTelemetry Collector ship with milestone v1.0. The procedure below is the planned shape; it may change before GA. Track progress on #5 (Helm chart) and #7 (pod template).

The eventual K8s upgrade flow:

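# Install (first time). OCI registries are referenced directly;
# `helm repo add` does not accept oci:// URLs.
# Chart path assumes the chart is published as baas-platform under ghcr.io/tosin2013/charts.
helm install helmdeck oci://ghcr.io/tosin2013/charts/baas-platform \
--namespace helmdeck --create-namespace \
--values my-values.yaml

# Upgrade in place:
helm upgrade helmdeck oci://ghcr.io/tosin2013/charts/baas-platform \
--namespace helmdeck \
--values my-values.yaml \
--version 1.1.0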

The Helm chart will:

  • Run schema migrations as a Job before the Deployment rolls (so a failed migration aborts the rollout instead of taking down the running pods)
  • Roll the helmdeck-control-plane Deployment with maxSurge=1, maxUnavailable=0 — zero-downtime upgrade
  • Trigger sidecar-image refresh by bumping the image.tag value (sidecar pods are spun up per-session, so existing sessions drain naturally)
  • Preserve the PostgreSQL StatefulSet (or external DB pointer) across the upgrade
  • Use Helm's built-in helm rollback for one-step rollback (see §6)

Until v1.0 ships, operators in production should use the Compose path with the documented database backups in §1.


5. Post-upgrade validation

After the new control-plane container is up, run these checks. They take ~2 minutes total.

# (1) Healthz
curl -fsS http://localhost:3000/healthz
# → {"status":"ok"}

# (2) New pack count matches expectations
JWT=$(./bin/control-plane -mint-token admin -mint-token-scopes admin)
curl -fsS http://localhost:3000/api/v1/packs -H "Authorization: Bearer $JWT" \
| python3 -c 'import sys,json; print(len(json.load(sys.stdin)), "packs")'
# v0.9.0 → 36; v0.10.0 → 38; cross-check against the new release's docs/PACKS.md

# (3) Vault credential count unchanged
curl -fsS http://localhost:3000/api/v1/vault/credentials -H "Authorization: Bearer $JWT" \
| python3 -c 'import sys,json; print(len(json.load(sys.stdin)), "creds")'
# Compare to the count from the pre-upgrade snapshot in §1

# (4) Smoke pack call (no external dependencies)
curl -fsS -X POST http://localhost:3000/api/v1/packs/browser.screenshot_url \
-H "Authorization: Bearer $JWT" -H 'Content-Type: application/json' \
-d '{"url":"https://example.com"}' \
| python3 -m json.tool | head -10
# Should return an artifact_key + size > 50000

# (5) Startup log has a fresh entry for the new version
docker logs helmdeck-control-plane 2>&1 | grep 'control-plane starting' | tail -1
# → "version":"v0.10.0","commit":"<short-sha>"
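Check (3) compares counts only, which would miss a credential that was renamed. Since §1 snapshots the credential names, a name-level comparison is possible. The sketch below assumes a post-upgrade snapshot taken with the same command as in §1 and written to a second file; `missing_creds` and the post-upgrade file name are hypothetical.

```shell
# missing_creds PRE POST: print credential names present in PRE but absent
# from POST (one name per line in each file). Returns 1 if any are missing.
missing_creds() {
  pre_sorted=$(mktemp) && post_sorted=$(mktemp) || return 2
  sort -u "$1" > "$pre_sorted"
  sort -u "$2" > "$post_sorted"
  # comm -23 keeps lines that appear only in the first file
  missing=$(comm -23 "$pre_sorted" "$post_sorted")
  rm -f "$pre_sorted" "$post_sorted"
  if [ -n "$missing" ]; then
    printf '%s\n' "$missing"
    return 1
  fi
}

# Usage:
#   missing_creds /tmp/helmdeck-creds-pre-upgrade.txt \
#                 /tmp/helmdeck-creds-post-upgrade.txt
```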

If any check fails, gather logs (docker logs helmdeck-control-plane) and consider rolling back (§6).


6. Rollback

If the upgrade misbehaves and you need to revert:

Compose path

cd /path/to/helmdeck
git checkout v0.9.0 # or the prior known-good tag
make install

Database compatibility: helmdeck's migrations are additive by convention — new versions add tables/columns; they don't drop or alter columns the prior version reads. So a v0.9.0 binary running against a database that v0.10.0 migrated should work, ignoring the new columns. Exception: if the new version's migrations include a destructive change (rare; flagged in the CHANGELOG ### Breaking section), restore the database backup from §1:

docker compose -f deploy/compose/compose.yaml stop control-plane
cp /var/lib/helmdeck/helmdeck.db.bak-<timestamp> /var/lib/helmdeck/helmdeck.db
docker compose -f deploy/compose/compose.yaml start control-plane
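If several backups have accumulated, the `<timestamp>` to restore is usually the most recent one. Because the §1 backup names end in `%Y%m%d-%H%M%S`, name order matches time order. The helper below is a sketch built on that naming; `latest_backup` is a hypothetical name.

```shell
# latest_backup DIR: print the newest helmdeck.db.bak-* file in DIR,
# or return non-zero if there is none.
latest_backup() {
  latest=""
  for f in "$1"/helmdeck.db.bak-*; do
    [ -e "$f" ] || continue   # glob matched nothing
    latest="$f"               # globs expand in sorted order; last is newest
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Usage:
#   latest_backup /var/lib/helmdeck
```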

Kubernetes path (v1.0+)

helm rollback helmdeck --namespace helmdeck # one revision back
helm rollback helmdeck 5 --namespace helmdeck # specific revision number
helm history helmdeck --namespace helmdeck # see all revisions

Helm tracks revisions. A rollback runs the previous chart version's migration Job, which is a no-op for additive migrations, or reverses the migration if the chart shipped a down.sql (most don't).


7. Version-specific notes

CHANGELOG.md is the canonical source. Pre-upgrade, scan the section for the version you're moving to:

| Section | Means |
| --- | --- |
| ### Added | New packs / endpoints / fields. Backward-compatible. |
| ### Changed | Behavior shifts that may surprise existing callers but don't error. Read carefully. |
| ### Fixed | Bug fixes. Usually safe; sometimes a fix changes observable behavior (e.g. PR #105 fixed vision.click_anywhere to actually use post-action screenshots, so agents written against the broken behavior may need their prompts adjusted). |
| ### Breaking | Schema or contract changes that will break existing integrations. Operator action required. RARE in helmdeck; we try to keep input/output schemas additive. |
| ### Removed | Features dropped. Will break callers that used them. Should be preceded by a deprecation notice in a prior release. |

For the v0.9.0 → v0.10.0 hop specifically: non-breaking. It adds the blog.publish and podcast.generate packs, fixes vision.click_anywhere per #102 (an improvement; existing callers see better behavior), and bumps the pack count from 36 to 38. There are no schema removals and no input-shape changes to existing packs.


See also