Helmdeck — Implementation Task Breakdown

Generated from docs/adrs/001–030 and PRD §16 roadmap. Each task lists its source ADR(s) and prerequisite tasks. IDs are stable for cross-reference.

Legend: P0 blocker / critical path · P1 required for phase exit · P2 important but parallelizable · P3 nice-to-have


Phase 1 — Core Infrastructure (Weeks 1–4)

Goal: ephemeral browser sessions callable via REST, single-node Compose deploy.

| ID | Task | Pri | ADRs | Depends on |
| --- | --- | --- | --- | --- |
| T101 | Bootstrap Go module github.com/tosin2013/helmdeck, set up cmd/control-plane, cmd/helmdeck-mcp, internal/ layout | P0 | 002 | |
| T102 | Wire goreleaser + GitHub Actions: build matrix (linux/darwin/windows × amd64/arm64), cosign signing, distroless image to ghcr.io | P0 | 002, 030 | T101 |
| T103 | Define SessionRuntime interface; implement Docker SDK backend (spawn, exec, logs, terminate) | P0 | 001, 004, 009 | T101 |
| T104 | Browser sidecar Dockerfile: Ubuntu base, headless Chromium, Marp, Tesseract (eng), ffmpeg, xdotool, scrot, Xvfb, XFCE4, noVNC, font packs | P0 | 001, 014, 015, 018, 019 | T101 |
| T105 | Session lifecycle: create/list/get/terminate REST endpoints with shm_size, timeout, maxTasks, memory/cpu limits; watchdog goroutine for leak/timeout recycle | P0 | 004 | T103 |
| T106 | CDP integration via chromedp: navigate, extract, screenshot, execute, interact endpoints | P0 | 002 | T105 |
| T107 | JWT auth middleware (Gin); token issuance scaffolding (full Access Control panel deferred to Phase 6) | P0 | 010 (security baseline) | T101 |
| T108 | SQLite migration runner; schema for sessions, audit log entries (Postgres parity behind interface) | P0 | 009 | T101 |
| T109 | Audit log writer: every API call records actor, session id, event type, payload (keys redacted) | P1 | 010 (baseline) | T108 |
| T110 | Compose stack deploy/compose/compose.yaml: control-plane + database + browser-pool template + internal baas-net bridge | P0 | 001, 009 | T102, T103 |
| T111 | Smoke test harness: make smoke spins compose stack, runs end-to-end navigate→screenshot→terminate flow | P1 | 009 | T106, T110 |

Phase 1 exit criteria: make smoke green; control-plane image <30 MB; browser sidecar image built and pushed; session create→navigate→screenshot→delete works end-to-end with JWT auth.
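
To make T103 concrete, here is a minimal sketch of what a SessionRuntime interface covering the four listed operations (spawn, exec, logs, terminate) could look like. The method signatures, errNoSession, memRuntime, and runLifecycle are all assumptions made for illustration; the in-memory backend only stands in for the Docker SDK implementation and does not start containers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
	"sync"
)

// SessionRuntime mirrors the four operations T103 names. The exact
// signatures are assumptions of this sketch, not the project's API.
type SessionRuntime interface {
	Spawn(ctx context.Context) (id string, err error)
	Exec(ctx context.Context, id string, argv []string) (string, error)
	Logs(ctx context.Context, id string) ([]string, error)
	Terminate(ctx context.Context, id string) error
}

// errNoSession is returned for unknown or already terminated sessions.
var errNoSession = errors.New("session not found")

// memRuntime is a hypothetical in-memory backend used only to exercise
// the interface; a real backend would call the Docker SDK here.
type memRuntime struct {
	mu   sync.Mutex
	next int
	logs map[string][]string
}

func newMemRuntime() *memRuntime { return &memRuntime{logs: make(map[string][]string)} }

func (m *memRuntime) Spawn(_ context.Context) (string, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.next++
	id := fmt.Sprintf("sess-%d", m.next)
	m.logs[id] = []string{"spawned"}
	return id, nil
}

func (m *memRuntime) Exec(_ context.Context, id string, argv []string) (string, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.logs[id]; !ok {
		return "", errNoSession
	}
	line := "exec " + strings.Join(argv, " ")
	m.logs[id] = append(m.logs[id], line)
	return line, nil
}

func (m *memRuntime) Logs(_ context.Context, id string) ([]string, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	lines, ok := m.logs[id]
	if !ok {
		return nil, errNoSession
	}
	return append([]string(nil), lines...), nil // copy so callers cannot mutate state
}

func (m *memRuntime) Terminate(_ context.Context, id string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.logs[id]; !ok {
		return errNoSession
	}
	delete(m.logs, id)
	return nil
}

// runLifecycle drives spawn, exec, logs, terminate against any backend
// and confirms the session is unreachable afterwards.
func runLifecycle(rt SessionRuntime) ([]string, error) {
	ctx := context.Background()
	id, err := rt.Spawn(ctx)
	if err != nil {
		return nil, err
	}
	if _, err := rt.Exec(ctx, id, []string{"echo", "hi"}); err != nil {
		return nil, err
	}
	lines, err := rt.Logs(ctx, id)
	if err != nil {
		return nil, err
	}
	if err := rt.Terminate(ctx, id); err != nil {
		return nil, err
	}
	if _, err := rt.Logs(ctx, id); !errors.Is(err, errNoSession) {
		return nil, fmt.Errorf("session %s still reachable after terminate", id)
	}
	return lines, nil
}

func main() {
	lines, err := runLifecycle(newMemRuntime())
	fmt.Println(lines, err)
}
```

Keeping the REST layer (T105) and later backends (T410, T701) behind one small interface like this is what lets them be swapped without touching callers.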


Phase 2 — AI Gateway + Capability Pack Substrate (Weeks 5–8)

Goal: OpenAI-compatible gateway live; Capability Pack execution engine usable; first three reference packs shipped.

| ID | Task | Pri | ADRs | Depends on |
| --- | --- | --- | --- | --- |
| T201 | OpenAI-compatible /v1/chat/completions + /v1/models facade routing on provider/model syntax | P0 | 005 | T107 |
| T202 | Provider adapters: Anthropic, Google Gemini, OpenAI, Ollama, Deepseek (HTTP clients with retry + connection pooling) | P0 | 005 | T201 |
| T203 | AES-256-GCM encrypted key store; key never returned in full; rotation API; provider test endpoint | P0 | 005, 007 | T108, T201 |
| T204 | Fallback chain rules engine: {primary, fallback, trigger} with rate-limit / error / timeout triggers | P1 | 005 | T201 |
| T205 | Pack Execution Engine: input schema validation → session acquire → handler invoke → output schema validation → artifact upload → typed result | P0 | 003, 008 | T106 |
| T206 | Closed-set typed error codes enforcement: middleware that maps any uncategorized handler error to nearest defined code | P0 | 008 | T205 |
| T207 | Pack registry: in-memory registration + REST POST /api/v1/packs/{name} dispatch + version routing /v{n} | P0 | 003, 024 | T205 |
| T208 | Built-in pack: browser.screenshot_url (reference pack; validates the whole substrate) | P0 | 021 | T207 |
| T209 | Built-in pack: web.scrape_spa with JSON Schema-driven extraction and partial-result handling | P0 | 017 | T207 |
| T210 | Built-in pack: slides.render (Marp + Chromium → PDF/PPTX/HTML) | P1 | 014 | T104, T207 |
| T211 | Object store integration (S3-compatible) for pack artifacts; signed URL generation | P0 | 014, 015, 018, 021 | T205 |
| T211a | Bundle Garage (dxflrs/garage) as the default object store in deploy/compose/compose.yaml; init container runs garage layout assign + garage layout apply on first boot; control plane env wired so make smoke exercises the persistent path end-to-end | P0 | 031 | T211, T110 |
| T211b | Artifact TTL janitor: control-plane goroutine scans audit-table pack output references older than HELMDECK_ARTIFACT_TTL (default 7d) and deletes the corresponding objects; per-pack overrides via pack manifest | P1 | 031 | T211, T109 |
| T211c | Cross-reference ADR 031 from ADRs 014 and 021 (one-line "see ADR 031 for backend choice" addition); update README install path to mention bundled Garage | P3 | 031 | T211a |
| T212 | A2A Agent Card endpoint /.well-known/agent.json auto-generated from pack registry | P2 | 026 | T207 |
| T213 | A2A task endpoint POST /a2a/v1/tasks with SSE streaming for long-running packs | P2 | 026 | T212 |

Phase 2 exit criteria: weak-model success rate ≥90% on browser.screenshot_url + web.scrape_spa against the MiniMax-M2.7 + Llama 3.2 7B cohort (per RELEASES.md v0.2.0 hard exit gate); AI gateway proxies all five providers; pack registry hot-loads new packs without restart.
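
The {primary, fallback, trigger} rules from T204 can be sketched as a small lookup that walks the chain for one trigger. This is only an illustration: the Rule and Trigger types, the fallbackChain helper, the first-match-wins policy, and the model names are hypothetical, not the gateway's actual rules engine.

```go
package main

import "fmt"

// Trigger names the failure classes T204 lists.
type Trigger string

const (
	RateLimit Trigger = "rate_limit"
	Error     Trigger = "error"
	Timeout   Trigger = "timeout"
)

// Rule is one {primary, fallback, trigger} entry. Models use the
// provider/model syntax from T201; the names below are placeholders.
type Rule struct {
	Primary  string
	Fallback string
	Trigger  Trigger
}

// fallbackChain returns the ordered models to try when requests keep
// failing with trigger t, starting at primary. At each hop the first
// matching rule wins (an assumption of this sketch). A model that was
// already visited ends the walk, so cyclic rule sets cannot loop forever.
func fallbackChain(rules []Rule, primary string, t Trigger) []string {
	chain := []string{primary}
	seen := map[string]bool{primary: true}
	current := primary
	for {
		next, ok := "", false
		for _, r := range rules {
			if r.Primary == current && r.Trigger == t {
				next, ok = r.Fallback, true
				break
			}
		}
		if !ok || seen[next] {
			return chain
		}
		chain = append(chain, next)
		seen[next] = true
		current = next
	}
}

func main() {
	rules := []Rule{
		{"anthropic/model-a", "openai/model-b", RateLimit},
		{"openai/model-b", "ollama/model-c", RateLimit},
		{"ollama/model-c", "anthropic/model-a", RateLimit}, // cycle back; ignored
		{"anthropic/model-a", "ollama/model-c", Timeout},
	}
	fmt.Println(fallbackChain(rules, "anthropic/model-a", RateLimit))
	fmt.Println(fallbackChain(rules, "anthropic/model-a", Timeout))
}
```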


Phase 3 — MCP Registry + Bridge + Client Integrations (Weeks 9–10)

Goal: all installed packs callable from Claude Code, Claude Desktop, OpenClaw, Gemini CLI via the bridge.

| ID | Task | Pri | ADRs | Depends on |
| --- | --- | --- | --- | --- |
| T301 | MCP server registry CRUD API; stdio/SSE/WebSocket transport adapters; manifest fetch + cache | P0 | 006 | T108 |
| T302 | Built-in MCP server exposing every installed pack as a typed MCP tool (auto-generated from pack registry) | P0 | 003, 006 | T207 |
| T618 | github.list_issues + github.search: complete GitHub CRUD + search. list_issues filters by state/label/assignee. search queries code/issues/PRs via GitHub search API. Both use vault PAT (optional for public repos). | P1 | 034 | T617 |
| T619 | git.diff + git.log: agents review changes before committing. diff shows uncommitted changes in a session clone. log shows recent commit history. Both use session exec via _session_id. | P1 | | T504a |
| T620 | fs.delete: remove a file in a session-local clone path. Same path-safety validation as other fs.* packs (isSafeClonePath + safeJoin). | P1 | | T550 |
| T621 | browser.interact: deterministic multi-step browser automation. Input: array of actions [{action:"navigate",url:"..."},{action:"click",selector:"#btn"},{action:"type",selector:"#input",value:"hello"},{action:"screenshot"},{action:"assert_text",text:"Success"}]. Uses existing chromedp. No LLM needed. Foundation for AI-powered web.test (T807e). | P1 | 035 | T106 |
| T617 | Core github.* pack set: 4 tools (create_issue, list_prs, post_comment, create_release) using vault-stored PATs via api.github.com. Pure HTTP, no gh CLI dependency. | P1 | 034 | T504 |
| T302b | MCP inline image content: image artifacts under a configurable threshold (default 1 MB) returned as type: "image" base64 content blocks in tools/call responses. Only the MCP transport gains this; REST API unchanged. Lets vision-capable LLMs reason about screenshots in one round trip. | P1 | 006, 032 | T302 |
| T613 | Artifact Explorer UI panel: standalone /artifacts route in the Management UI listing recent artifacts with inline image preview, download button, pack/date filter. Backed by GET /api/v1/artifacts. | P1 | 032 | T601, T211 |
| T302a | SSE MCP transport at /api/v1/mcp/sse (GET stream + paired POST endpoint per the MCP SSE spec). Lets containerized clients like OpenClaw connect via URL transport without baking the stdio bridge into their image. Closes the sidecar-pattern gap that left the OpenClaw integration walkthrough blocked. | P0 | 006 | T302 |
| T303 | helmdeck-mcp bridge binary: stdio MCP server proxying to platform's WebSocket MCP endpoint via HELMDECK_URL + HELMDECK_TOKEN | P0 | 025, 030 | T302 |
| T304 | Bridge version-skew warning: emit deprecation notification on session start when older than platform's min recommended | P1 | 025, 030 | T303 |
| T305 | Distribution channels via goreleaser: Homebrew tap (tosin2013/helmdeck), Scoop bucket, GitHub Releases (cosigned) | P0 | 030 | T102, T303 |
| T306 | npm package @helmdeck/mcp-bridge with postinstall binary downloader from GH Releases | P1 | 030 | T305 |
| T307 | OCI image ghcr.io/tosin2013/helmdeck-mcp (multi-arch) for containerized agents | P1 | 030 | T305 |
| T308 | CI smoke matrix: spawn helmdeck-mcp from each of Claude Code, Claude Desktop, OpenClaw, Gemini CLI configs and assert browser.screenshot_url returns a PNG | P0 | 025 | T303, T208 |
| T309 | "Connect" UI snippets per client (deferred to Phase 6 when UI lands; stub the JSON generators now) | P2 | 025 | T303 |

Phase 3 exit criteria: all four target clients invoke browser.screenshot_url end-to-end via the bridge in CI; bridge installable via brew install, npx, scoop install, docker run.
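
The size-threshold rule in T302b can be illustrated with a short sketch. The image block shape (type, data, mimeType) follows the MCP content format; everything else here is an assumption: the ContentBlock struct, the artifactContent helper, and in particular the text block carrying a URL for artifacts that are too large or not images.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// ContentBlock is a reduced view of one MCP tool-result content item.
type ContentBlock struct {
	Type     string `json:"type"`
	Data     string `json:"data,omitempty"`
	MimeType string `json:"mimeType,omitempty"`
	Text     string `json:"text,omitempty"`
}

// defaultInlineThreshold is the 1 MB default mentioned in T302b.
const defaultInlineThreshold = 1 << 20

// artifactContent inlines image artifacts strictly under the threshold
// as base64 image blocks. Anything else falls back to a text block that
// carries the artifact URL (hypothetical fallback for this sketch).
func artifactContent(body []byte, mimeType, url string, threshold int) ContentBlock {
	if strings.HasPrefix(mimeType, "image/") && len(body) < threshold {
		return ContentBlock{
			Type:     "image",
			Data:     base64.StdEncoding.EncodeToString(body),
			MimeType: mimeType,
		}
	}
	return ContentBlock{Type: "text", Text: url}
}

func main() {
	small := artifactContent([]byte("PNG"), "image/png", "https://example.invalid/a.png", defaultInlineThreshold)
	large := artifactContent(make([]byte, defaultInlineThreshold), "image/png", "https://example.invalid/b.png", defaultInlineThreshold)
	fmt.Println(small.Type, large.Type)
}
```

Because only the MCP transport gains this behavior, a helper like this would sit in the MCP response path and leave the REST payloads untouched.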


Phase 4 — Desktop Actions + Vision Mode (Weeks 11–13)

| ID | Task | Pri | ADRs | Depends on |
| --- | --- | --- | --- | --- |
| T401 | Desktop Actions REST API: screenshot, click, type, key, launch, windows, focus (xdotool/scrot wrappers) | P0 | 027 | T106 |
| T402 | Built-in pack: desktop.run_app_and_screenshot | P1 | 018 | T401 |
| T403 | Built-in pack: doc.ocr (Tesseract with language pack support) | P1 | 019 | T207 |
| T404 | Built-in pack: web.fill_form. Superseded by T503 (CDP cookie injection) + T408 (vision.fill_form_by_label); the "fill a form with a vault credential" capability ships through both | | 020 | |
| T405 | Built-in pack: web.login_and_fetch. Superseded by T504 (http.fetch with ${vault:NAME} substitution) + T503; the substantive auth pattern is the placeholder-token flow, not a dedicated browser-driven login pack | | 016 | |
| T406 | Built-in pack: slides.video. Deferred; not on the GA path. Worth revisiting alongside T804 (WebRTC streaming) since the same audio/video pipeline serves both | | 015 | |
| T407 | Vision-mode endpoint POST /api/v1/sessions/{id}/vision/act: screenshot → AI gateway → action loop | P1 | 027 | T201, T401 |
| T408 | Reference vision packs: vision.click_anywhere, vision.extract_visible_text, vision.fill_form_by_label | P2 | 027 | T407 |
| T409 | noVNC live viewer endpoint /api/v1/desktop/vnc-url (baseline; WebRTC in Phase 6+) | P2 | 028 | T401 |
| T410 | Steel Browser optional integration as alternate browser layer behind SessionRuntime interface | P3 | 001 | T103 |

Phase 4 exit criteria: desktop session screenshots work; web.login_and_fetch succeeds against a test SaaS using a vault credential; vision mode demo on a Canvas-only page.


Phase 5 — Credential Vault + Repo Packs + Hardening (Weeks 14–16)

| ID | Task | Pri | ADRs | Depends on |
| --- | --- | --- | --- | --- |
| T501 | Credential Vault: AES-256-GCM store with separate encryption key, host/path pattern matcher, agent-scope ACL, usage log | P0 | 007 | T108, T203 |
| T502 | Vault credential types: website login, session cookies, API key, OAuth (with refresh), SSH/git | P0 | 007 | T501 |
| T503 | CDP cookie injection at session start (Network.setCookies) and form-autofill fallback | P0 | 007, 016 | T501, T106 |
| T504 | HTTP gateway placeholder-token interception: intercept agent egress, swap placeholder for real credential, forward | P0 | 007 | T501 |
| T505 | Built-in pack: repo.fetch (URL normalization, vault SSH key, GIT_SSH_COMMAND with accept-new, retries) | P0 | 022 | T501 |
| T506 | Built-in pack: repo.push (paired with repo.fetch; non-fast-forward → schema_mismatch with detail) | P1 | 023 | T505 |
| T507 | OneCLI delegation mode: optional config to forward credential resolution to external OneCLI | P3 | 007 | T501 |
| T508 | Application-layer egress guard: refuses any pack-handler call to a host that resolves to 169.254.169.254/32, RFC 1918 ranges, loopback, IPv6 link-local, or carrier-grade NAT, with DNS rebinding defense (every returned IP must pass). HELMDECK_EGRESS_ALLOWLIST for internal hosts. K8s NetworkPolicy lands separately as T706. | P0 | 011 | T103 |
| T509 | Sandbox baseline: non-root UID 1000 (helmdeck user in sidecar Dockerfile), cap-drop ALL + cap-add SYS_ADMIN (Chromium namespace sandbox), no-new-privileges, pids-limit 1024 (defaults; override via HELMDECK_PIDS_LIMIT), seccomp defaults to docker's curated profile (override via HELMDECK_SECCOMP_PROFILE) | P0 | 011 | T103 |
| T510 | OpenTelemetry instrumentation: traces with gen_ai.system, gen_ai.request.model, gen_ai.usage.* on every LLM/MCP/pack span; OTLP exporter | P0 | 013 | T201, T205 |
| T511 | Trivy CI scan; fail pipeline on CRITICAL findings | P0 | 030 | T102 |
| T511a | Gitleaks secret-scan workflow + .gitleaks.toml allowlist. Closes the gap left when T511 was scoped to scanners: vuln,misconfig (secret detection deferred to gitleaks to avoid double-reporting). Runs on every push + PR against main via gitleaks/gitleaks-action@v2 with fetch-depth: 0 so it scans full history. Allowlist covers the stable dev credentials checked into deploy/compose/garage.toml; the file's header comment already documents them as override-in-production. | P1 | 030 | T511 |
| T511b | Contributor CI-parity: make check target (= vet + -race test + build, exactly what the vet + test + build CI job runs), opt-in .githooks/pre-push wiring via make install-hooks, plus the TestBridgeRoundTrip race fix (wrap shared bytes.Buffer in a test-only safeBuffer with sync.Mutex) + trivy-action pin bump 0.28.0 → 0.35.0. Catches CI failures locally before they land in a PR. Production internal/bridge/bridge.go unchanged: the race only existed because the test shared a buffer between the test goroutine and the bridge's background writer. | P2 | 030 | T511 |

Phase 5 exit criteria: repo.fetch against a private GitHub repo with vault SSH key works end-to-end without agent ever seeing the key; OTel traces visible in a Langfuse instance; egress allowlist blocks metadata IP.
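
The address checks behind T508 can be sketched with the standard net/netip package. This is a minimal illustration under stated assumptions, not the project's guard: blocked and hostAllowed are hypothetical names, the HELMDECK_EGRESS_ALLOWLIST override is not modeled, and the check is slightly stricter than the listed ranges because netip's IsPrivate also covers IPv6 unique-local addresses and the sketch rejects unspecified addresses.

```go
package main

import (
	"fmt"
	"net/netip"
)

// cgnat is the carrier-grade NAT range (RFC 6598), which netip's
// IsPrivate does not include.
var cgnat = netip.MustParsePrefix("100.64.0.0/10")

// blocked reports whether one resolved address falls in a refused range.
func blocked(ip netip.Addr) bool {
	ip = ip.Unmap() // treat ::ffff:10.0.0.1 like 10.0.0.1
	return ip.IsLoopback() ||
		ip.IsPrivate() || // RFC 1918 plus IPv6 unique-local fc00::/7
		ip.IsLinkLocalUnicast() || // 169.254.0.0/16 (metadata IP) and fe80::/10
		ip.IsUnspecified() ||
		cgnat.Contains(ip)
}

// hostAllowed applies the rebinding defense: every address the resolver
// returned must pass, and an empty or unparsable answer is refused.
func hostAllowed(resolved ...string) bool {
	if len(resolved) == 0 {
		return false
	}
	for _, s := range resolved {
		ip, err := netip.ParseAddr(s)
		if err != nil || blocked(ip) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(hostAllowed("93.184.216.34"))             // true
	fmt.Println(hostAllowed("169.254.169.254"))           // false
	fmt.Println(hostAllowed("93.184.216.34", "10.0.0.5")) // false: one bad record fails the host
}
```

Checking every returned record, rather than only the first, is what keeps a host with one public and one internal A record from slipping through.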


Phase 5.5 — Code Edit Loop (interleaved with Phase 5)

Goal: turn repo.fetch into a working autonomous code-edit workflow by adding the file/git/cmd primitives the LLM needs to actually modify and test code inside a clone.

| ID | Task | Pri | ADRs | Depends on |
| --- | --- | --- | --- | --- |
| T550 | Built-in pack: fs.read (read file from clone with size cap + sha256, path safety via safeJoin) | P0 | 022 | T505 |
| T551 | Built-in pack: fs.write (write file with mkdir -p for parents, content via stdin) | P0 | 022 | T505 |
| T552 | Built-in pack: fs.patch (literal search-and-replace, NOT regex; optional occurrence cap; sha256 of result) | P0 | 022 | T550, T551 |
| T553 | Built-in pack: fs.list (find files under clone path with optional glob, recursive flag, 5000-entry cap) | P1 | 022 | T550 |
| T554 | Built-in pack: cmd.run (run an arbitrary shell command in a clone path with stdin support; non-zero exit codes are normal pack outcomes) | P0 | 022 | T505 |
| T555 | Built-in pack: git.commit (stage + commit with helmdeck-agent author env injection; "nothing to commit" maps to invalid_input) | P0 | 023 | T505 |
| T556 | Built-in pack: http.fetch (placeholder-token demo: ${vault:NAME} substitution in URL/headers/body via the wrapped http.Client; egress-guarded) | P0 | 007 | T504 |
| T557 | docs/integrations/README.md: index + per-client status matrix (✅ tested & integrated / 🟡 documented, not yet verified / ⚪ planned) | P0 | 025 | T556 |
| T558 | docs/integrations/claude-code.md: prerequisites, bridge install, client config, Phase 5.5 code-edit-loop walkthrough, troubleshooting; status banner at top | P0 | 025 | T557 |
| T559 | docs/integrations/claude-desktop.md: same shape as T558 | P1 | 025 | T557 |
| T560 | docs/integrations/openclaw.md: same shape as T558 | P1 | 025 | T557 |
| T561 | docs/integrations/nemoclaw.md: same shape as T558 | P1 | 025 | T557 |
| T562 | docs/integrations/gemini-cli.md: same shape as T558 | P1 | 025 | T557 |
| T563 | docs/integrations/hermes-agent.md: same shape as T558 | P2 | 025 | T557 |
| T564 | scripts/validate-clients.sh (manual helper): boots compose stack, prints /api/v1/connect/{client} snippets + a copy-pasteable JSON-RPC scenario for the Phase 5.5 code-edit loop. Operator runs the scenario by hand against each client. No pass/fail automation. | P1 | 025 | T557 |
| T565 | Walk the Phase 5.5 code-edit loop against Claude Code end-to-end against a real private GitHub repo; flip docs/integrations/claude-code.md banner + docs/integrations/README.md matrix row to ✅ with date + Helmdeck version. This is the actual v0.5.5 exit gate; T557–T564 are scaffolding for it. | P0 | 025 | T558, T564 |
| T570 | scripts/install.sh one-command bootstrap. Preflight (docker, node≥20, go≥1.26, make, openssl, curl) with platform-aware install hints; idempotent secret generation into deploy/compose/.env.local (chmod 600); build pipeline (make web-deps && web-build && build && sidecar-build); docker compose up -d --wait; healthcheck poll; post-install summary block; --reset and --no-build flags. Side effects: make install target, compose.yaml env_file: .env.local wiring (so vault/keystore/admin secrets actually reach the container), .gitignore exclusion of .env* with exception for .env.example, README Quick Start rewrite. Verified end-to-end on a fresh Ubuntu 24.04 multipass VM (missing-prereq path + happy path + idempotency + --reset). | P0 | 009 | T211a, T501 |

Phase 5.5 exit criteria: every client listed in docs/integrations/ has a setup guide, and at least Claude Code is marked ✅ tested & integrated by walking through the full repo.fetch → fs.list → fs.read → fs.patch → cmd.run → git.commit → repo.push loop against a real private GitHub repo, with the SSH key never in the LLM's context window and every step audit-logged.
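
The fs.patch contract from T552 (literal search-and-replace, optional occurrence cap, sha256 of the result) can be sketched as follows. The patchLiteral name and error values are hypothetical, and reading the cap as a guard that refuses the patch, rather than replacing only the first N matches, is an assumption of this sketch.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

var (
	errNotFound = errors.New("search text not found")
	errTooMany  = errors.New("search text occurs more often than the cap allows")
)

// patchLiteral replaces every literal occurrence of search with replace
// and returns the new content plus its hex-encoded SHA-256.
// maxOccurrences <= 0 disables the cap.
func patchLiteral(content, search, replace string, maxOccurrences int) (string, string, error) {
	if search == "" {
		return "", "", errNotFound // an empty needle would match everywhere
	}
	n := strings.Count(content, search)
	if n == 0 {
		return "", "", errNotFound
	}
	if maxOccurrences > 0 && n > maxOccurrences {
		return "", "", errTooMany
	}
	out := strings.ReplaceAll(content, search, replace)
	sum := sha256.Sum256([]byte(out))
	return out, hex.EncodeToString(sum[:]), nil
}

func main() {
	// "a.b" is matched literally, so "axb" is left alone; a regex would
	// have treated the dot as a wildcard.
	out, sum, err := patchLiteral("axb a.b", "a.b", "Z", 0)
	fmt.Println(out, sum, err)
}
```

Returning the hash alongside the content gives the caller a cheap way to confirm that a later fs.read sees exactly the bytes that were written.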


Phase 6 — Management UI (Weeks 17–20)

| ID | Task | Pri | ADRs | Depends on |
| --- | --- | --- | --- | --- |
| T601 | React/Tailwind/shadcn UI shell embedded in Go binary; JWT login flow | P0 | 002 | T107 |
| T602 | Dashboard panel: metric cards + activity feed + Recharts memory chart | P1 | | T601, T109 |
| T603 | Browser Sessions panel: data table, New Session modal, View Logs drawer, Terminate confirm | P0 | 004 | T601, T105 |
| T604 | AI Providers panel: provider cards, Configure modal, Test Connection, Routing Rules table | P0 | 005 | T601, T203 |
| T605 | MCP Registry panel: server table, Add Server multi-step modal, Tool Inspector | P0 | 006 | T601, T301 |
| T606 | Capability Packs panel (the killer feature): list grouped by namespace, Overview/Schema/Test Runner tabs | P0 | 003, 024 | T601, T207 |
| T202a | Wire keystore-stored provider keys into gateway.Registry at startup + on every key mutation (hot reload). Adds HELMDECK_OPENROUTER_API_KEY env-var fast path for OpenAI-compatible aggregators not yet modeled in the keystore schema. Closes the gap that left v0.6.0 with a non-functional /v1/chat/completions despite T202 being marked complete. Post-v0.8.0: community PRs extended LoadCustomOpenAIProviders with Groq (PR #45, issue #35) and Mistral (PR #47, issue #36) adapters, both riding the same HELMDECK_{PROVIDER}_API_KEY[_FILE] / _BASE_URL / _MODELS env-var contract. Local Ollama (no key) added on the same pattern. | P0 | 005 | T203 |
| T607 | Model Success Rates tab with per-model breakdown, 80% threshold highlight, "Tighten Schema" diff view | P0 | 003, 008, 024 | T606, T510 |
| T608 | Pack Authoring UI. Moved to Phase 8 (see row in Phase 8 table); depends on T801 (WASM Executor) or a composite-pack runtime, neither of which is on the v0.6.0 critical path | | 024 | T606, T801 |
| T609 | Security Policies panel: Network/Sandbox/Access Control tabs | P1 | 011 | T601, T508 |
| T610 | Credential Vault panel: credentials table, Add Credential modal, Session Cookie import tool, Usage Log tab | P1 | 007 | T601, T501 |
| T611 | Audit Logs panel: filter bar, infinite-scroll table, Details drawer with redacted JSON payload | P1 | 013 | T601, T109 |
| T612 | "Connect" UI buttons for Claude Code / Claude Desktop / OpenClaw / Gemini CLI emitting OS-detected one-liners | P1 | 025, 030 | T601, T309 |
| T602a | Recharts memory chart on Dashboard panel: time-series of process_resident_memory_bytes from the control-plane Prometheus scrape | P2 | | T602 |
| T603a | New Session modal on Browser Sessions panel: form-based session creation with shm_size, timeout, maxTasks, mem/cpu limits | P1 | 004 | T603 |
| T604a | Add/Rotate provider key modal on AI Providers panel: encrypted-at-rest write to keystore, hot reload via gateway.Hydrate | P1 | 005 | T604, T202a |
| T605a | Add Server modal on MCP Registry panel: stdio/SSE/WebSocket transport pickers, server health probe | P1 | 006 | T605 |
| T606a | Pack Test Runner tab on Capability Packs panel: form derived from input schema, dispatch to POST /api/v1/packs/{name}, render typed output + artifacts | P0 | 003, 008, 024 | T606 |
| T609a | Security Policies panel, edit + reload-config: write-through to HELMDECK_EGRESS_ALLOWLIST etc.; POST /api/v1/security/reload warm-reloads guards without stack restart | P2 | 011 | T609 |
| T610a | Add Credential modal + Usage Log tab on Credential Vault panel: typed credential entry, masked value reveal-on-click, scoped ACL editor | P1 | 007 | T610 |
| T612a | OS-detected one-liners on Connect Clients panel: macOS/Linux/Windows command snippets per client (Claude Code, Claude Desktop, OpenClaw, Gemini CLI, Hermes Agent), copy buttons | P2 | 025, 030 | T612 |
| T615 | GitHub PAT setup in scripts/install.sh: optional interactive prompt stores token in vault as github-token so the GitHub pack family works out-of-box | P1 | 007 | T501, T570 |
| T616 | GitHub webhook listener at POST /api/v1/webhooks/github: HMAC-SHA256 signature validation, async pack dispatch per event rules (push, pull_request initially) | P2 | 033 | T207, T617 |

Phase 6 exit criteria: every read-only Phase 6 panel (Dashboard, Sessions, AI Providers, MCP Registry, Capability Packs, Security Policies, Credential Vault, Audit Logs, Connect Clients) ships against a real backend with success-rate visibility (T607). Pack authoring (T608) is deferred to Phase 8 — operators observe and dispatch packs in v0.6.0; they author them in v1.x once a sandboxed runtime (T801) lands.
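
For the webhook listener in T616, HMAC-SHA256 validation could be sketched as below. GitHub sends the signature in the X-Hub-Signature-256 header as sha256= followed by the hex digest of the raw request body; the sign and validSignature helper names and the secret value are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

const sigPrefix = "sha256="

// sign computes the header value for a given secret and raw body.
// It exists here so the sketch can produce a valid signature to check.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return sigPrefix + hex.EncodeToString(mac.Sum(nil))
}

// validSignature verifies the X-Hub-Signature-256 header value against
// the raw body, comparing digests in constant time.
func validSignature(secret, body []byte, header string) bool {
	if !strings.HasPrefix(header, sigPrefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, sigPrefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-webhook-secret") // placeholder value
	body := []byte(`{"action":"opened"}`)
	header := sign(secret, body)
	fmt.Println(validSignature(secret, body, header))                     // true
	fmt.Println(validSignature(secret, []byte(`{"action":"x"}`), header)) // false
}
```

The check has to run over the exact bytes received, before any JSON decoding, since re-serializing the payload would change the digest.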


Phase 6.5 — MCP Server Hosting & Pack Evolution

Goal: validate the "host, don't rebuild" pattern from ADR 035 by bundling third-party MCP servers (Playwright MCP) and integration services (Firecrawl, Docling) into the helmdeck stack, plus add native computer-use tool routing and three composite/pipeline packs that exploit the new substrate. Ships as v0.8.0.

| ID | Task | Pri | ADRs | Depends on |
| --- | --- | --- | --- | --- |
| T807a | Bundle Playwright MCP (@playwright/mcp) into the browser sidecar Dockerfile (Node 20 + PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1); auto-launch via npx --cdp-endpoint=http://127.0.0.1:9222 --port 8931 after Chromium is live; surface as playwright_mcp_endpoint on the session REST. Opt-out via HELMDECK_PLAYWRIGHT_MCP_ENABLED=false. | P0 | 035 | T104, T105 |
| T807b | Add Firecrawl as an optional compose overlay (deploy/compose/compose.firecrawl.yml); new web.scrape pack (no selectors, returns clean markdown). Env-gated on HELMDECK_FIRECRAWL_ENABLED. Egress guard wraps the target URL before the upstream call. | P1 | 035 | T207, T508 |
| T807c | Add Docling as an optional compose overlay (compose.docling.yml, image quay.io/docling-project/docling-serve:latest with named model-cache volume); new doc.parse pack supersedes doc.ocr for layout/tables/multi-format. Env-gated on HELMDECK_DOCLING_ENABLED. | P1 | 035 | T207 |
| T807d | browser-use / Skyvern wrapper. Superseded by T807f. Native computer-use tool schemas now ship across all three frontier providers; wrapping a Python agent loop adds no value. | | 035 | |
| T807e | web.test: natural-language browser testing via Playwright MCP accessibility tree. Plan-step loop: snapshot → ask gateway LLM for one tool call → dispatch via pwmcp → re-snapshot until done/fail/max_steps. Egress-guarded mid-test navigations. | P1 | 035 | T807a, T201 |
| T807f | Native computer-use tool routing (supersedes T807d). Six work packages: gateway tool-use plumbing across Anthropic/OpenAI/Gemini, eight new desktop REST primitives, vision.StepNative cross-provider executor, EventComputerUse audit + replay, AgentStatus on noVNC banner, ADR 035 revision. JSON-prompt fallback for non-native providers. | P0 | 035 | T407, T201, T510 |
| T622 | research.deep: Firecrawl-backed deep research composite pack; a single /v1/search call with scrapeOptions.formats=["markdown"] does search + per-source scrape in one round trip; results synthesized by the gateway model with inline URL citations. Limit defaults 5, hard cap 10. | P2 | 035 | T807b, T201 |
| T622a | repo.fetch context envelope (tree, readme, entrypoints, doc_hints, signals) so agents orient on the first turn; companion opt-in repo.map pack produces a ctags-derived structural symbol map under a token budget. Closes the "empty repo" false positive when README.adoc isn't auto-detected. | P1 | 022, 036 | T505 |
| T623 | content.ground (link grounding for blog posts): extract verbatim claims from a markdown file via the gateway model, search Firecrawl for authoritative sources, write [claim](url) annotations back into the file via literal substring substitution. Hallucinated claims (text not in file) are skipped, not patched. | P2 | 035 | T622, T552 |
| T406 | slides.narrate: narrated MP4 video from Marp decks (moved from Phase 4 with expanded scope). Per-slide PNGs → ElevenLabs TTS (vault elevenlabs-key) → ffmpeg segment assembly with optional fades → LLM-generated YouTube metadata (title, M:SS timestamps, tags). Degrades gracefully when key/model is absent. | P2 | 014 | T210, T501 |

Phase 6.5 exit criteria: v0.8.0 tagged with 36 packs total; scripts/validate-phase-6-5.sh passes against a fresh stack including the Firecrawl + Docling overlays; native computer-use round-trip works against at least one of Anthropic / OpenAI / Gemini.
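
The write-back step of content.ground (T623) is a literal substring substitution that skips claims not found verbatim in the file. A minimal sketch, assuming a hypothetical Grounding type and annotate helper, and assuming only the first occurrence of each claim is linked:

```go
package main

import (
	"fmt"
	"strings"
)

// Grounding pairs a claim the model reports having extracted with the
// source URL found for it.
type Grounding struct {
	Claim string
	URL   string
}

// annotate rewrites the first verbatim occurrence of each claim as
// [claim](url). Claims that do not appear in doc are treated as
// hallucinated: they are returned in skipped and the text is not touched.
func annotate(doc string, groundings []Grounding) (patched string, skipped []string) {
	patched = doc
	for _, g := range groundings {
		if g.Claim == "" || !strings.Contains(patched, g.Claim) {
			skipped = append(skipped, g.Claim)
			continue
		}
		link := "[" + g.Claim + "](" + g.URL + ")"
		patched = strings.Replace(patched, g.Claim, link, 1)
	}
	return patched, skipped
}

func main() {
	doc := "Go 1.0 shipped in 2012. It is fast."
	out, skipped := annotate(doc, []Grounding{
		{Claim: "Go 1.0 shipped in 2012", URL: "https://example.invalid/go"},
		{Claim: "Rust predates Go", URL: "https://example.invalid/rust"},
	})
	fmt.Println(out)
	fmt.Println(skipped)
}
```

This sketch does not guard against a later claim matching text inside a link that was already inserted; a fuller version would need to track patched ranges.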


Phase 7 — Kubernetes / Helm / Production Hardening (Weeks 21–22)

| ID | Task | Pri | ADRs | Depends on |
| --- | --- | --- | --- | --- |
| T701 | client-go SessionRuntime backend: spawn pods in baas-sessions namespace via K8s API | P0 | 009 | T103 |
| T702 | Helm chart charts/baas-platform/: control-plane Deployment x2, PDB, Service, Ingress, ServiceAccount + Role + RoleBinding scoped to baas-sessions | P0 | 009 | T701 |
| T703 | PostgreSQL StatefulSet sub-chart (Bitnami); database.external.enabled toggle | P0 | 009 | T108, T702 |
| T704 | Session pod template: restartPolicy: Never, automountServiceAccountToken: false, seccomp Localhost profile, /dev/shm emptyDir medium: Memory sizeLimit: 2Gi | P0 | 004, 011 | T701 |
| T705 | NetworkPolicy 1: allow baas-system → baas-sessions on port 9222 | P0 | 011 | T702 |
| T706 | NetworkPolicy 2: restrict session pod egress, block 169.254.169.254/32 + 10.0.0.0/8, render allowlist from Security Policies | P0 | 011 | T508, T702 |
| T707 | KEDA ScaledObject reading baas_queued_session_requests and baas_active_sessions / baas_pool_capacity from Prometheus; thresholds 1 and 0.8 | P0 | 010 | T510, T702 |
| T708 | browser-pool-warmup Deployment maintaining N pre-initialized session pods; control plane claim/release protocol | P0 | 010 | T707 |
| T709 | isolation.level Helm value: standard (Docker default), enhanced (gVisor runsc RuntimeClass), maximum (firecracker-containerd RuntimeClass) | P1 | 011 | T704 |
| T710 | cert-manager Certificate resource + Ingress-NGINX TLS termination; tls.enabled toggle | P1 | 009 | T702 |
| T711 | OTel Collector DaemonSet (K8s tier) / sidecar (Compose tier); OTLP forwarder | P1 | 013 | T510 |
| T712 | External Secrets Operator integration; vault.externalSecrets.enabled toggle | P2 | 007 | T501, T702 |
| T713 | Argo CD reference application manifest in deploy/gitops/ | P2 | 009 | T702 |
| T714 | Load test: 100 concurrent sessions, 24 h soak, validate ≤150 MB control plane footprint and ≤5 s recovery | P0 | 010 | T708 |
| T715 | External security audit; remediate findings before GA | P0 | 011 | T714 |

Phase 7 exit criteria: Helm install on a fresh GKE/EKS cluster passes the same smoke matrix as Compose; KEDA scales pool under synthetic load; gVisor tier passes the smoke matrix; security audit clean.


Phase 8 — Innovation Backlog (Post-GA, Weeks 23+)

These are tracked but not on the GA critical path.

| ID | Task | Pri | ADRs |
| --- | --- | --- | --- |
| T801 | WASM Executor subsystem (wasmtime-go); WASI capability inspection; .wasm pack handler runtime | P1 | 012, 024 |
| T608 | Pack Authoring UI: schema editor with live validation, handler editor, Test Runner, Publish (moved from Phase 6; depends on T801 for a sandboxed handler runtime) | P1 | 024 |
| T802 | Four-tier Memory API: Working (in-process) + Episodic (Redis) + Semantic (pgvector) + Procedural (read-only) | P1 | 029 |
| T803 | Procedural-memory → Pack promotion UI flow ("Pack Candidates") | P2 | 024, 029 |
| T804 | WebRTC live session streaming via pion/webrtc; LiveKit SFU optional path; bidirectional control data channel | P2 | 028 |
| T805 | Audio capture for desktop sessions (PulseAudio → WebRTC second track) | P3 | 028 |
| T806 | WebMCP detection on visited pages; preferential routing through navigator.modelContext when available | P2 | 027 |
| T807 | Pre-packaged Chrome DevTools MCP and Playwright MCP registry entries pointing at managed sessions | P2 | 006 |
| T808 | Firecracker isolation tier productionization (bare-metal node pool docs, networking model) | P2 | 011 |
| T809 | Lightpanda alternate browser engine evaluation | P3 | 001 |
| T810 | Pack marketplace registry model: index.yaml catalog schema, helmdeck-pack.yaml manifest, cosign trust, HELMDECK_MARKETPLACE_URL env var, catalog refresh endpoint | P1 | 034 |
| T811 | command handler type: subprocess packs in any language (stdin JSON / stdout JSON), sandboxed with same egress guard + audit logging as built-in packs | P1 | 034 |
| T812 | helmdeck pack install/uninstall CLI commands + POST /api/v1/marketplace/install REST endpoint with hot-load (no restart) | P1 | 034 |
| T813 | Marketplace UI panel: /marketplace route with browse-by-category, search, pack detail cards, install/uninstall buttons, trust badges (Core / Signed / Unsigned) | P1 | 034 |
| T814 | Community marketplace repo (tosin2013/helmdeck-marketplace): initial catalog with contribution guide, CI for manifest validation, cosign signing in release pipeline | P2 | 034 |
| T815 | Pack ratings + install counts: requires marketplace-web frontend repo, user accounts (GitHub OAuth), star/rating system, install analytics behind SessionRuntime interface | P3 | 001 |
| T816 | MCP Server Hosting framework: generic helmdeck mcp install <server> for community MCP servers with sandboxed execution; converges with the marketplace (T810) so any catalog entry that ships an MCP server, not just a pack manifest, can be hosted by helmdeck rather than rebuilt as a pack | P2 | 035 |

Critical Path

Main chain: T101 → T102 → T103 → T105 → T106 → T205 → T207 → T208 → T302 → T303 → T308
Parallel chain: T201 → T202 → T203 → T501 → T504 → T505
Converging chain: T508 → T701 → T702 → T714 → T715 → GA

The hard sequence is: Go scaffolding → session runtime → CDP → pack engine → reference pack → MCP server → bridge → client smoke matrix; in parallel: AI gateway → vault → repo packs; converging on K8s + load test + audit before GA.
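
An ordering like the main chain can be derived mechanically from the "Depends on" columns. The sketch below uses Kahn's algorithm with a lexicographic tie-break; topoOrder is a hypothetical helper, and the dependency map in main is a hand-copied subset of the Phase 1 to 3 tables, not a generated artifact.

```go
package main

import (
	"fmt"
	"sort"
)

// topoOrder returns the tasks so that each one appears after all of its
// prerequisites. Ties are broken by task ID so the output is stable.
// It returns an error when the dependencies contain a cycle.
func topoOrder(deps map[string][]string) ([]string, error) {
	indegree := map[string]int{}
	dependents := map[string][]string{}
	for task, prereqs := range deps {
		if _, ok := indegree[task]; !ok {
			indegree[task] = 0
		}
		for _, p := range prereqs {
			if _, ok := indegree[p]; !ok {
				indegree[p] = 0
			}
			indegree[task]++
			dependents[p] = append(dependents[p], task)
		}
	}
	var ready []string
	for task, d := range indegree {
		if d == 0 {
			ready = append(ready, task)
		}
	}
	var order []string
	for len(ready) > 0 {
		sort.Strings(ready) // deterministic choice among ready tasks
		task := ready[0]
		ready = ready[1:]
		order = append(order, task)
		for _, d := range dependents[task] {
			indegree[d]--
			if indegree[d] == 0 {
				ready = append(ready, d)
			}
		}
	}
	if len(order) != len(indegree) {
		return nil, fmt.Errorf("dependency cycle among %d tasks", len(indegree)-len(order))
	}
	return order, nil
}

func main() {
	deps := map[string][]string{
		"T101": nil,
		"T102": {"T101"},
		"T103": {"T101"},
		"T105": {"T103"},
		"T106": {"T105"},
		"T205": {"T106"},
		"T207": {"T205"},
		"T208": {"T207"},
		"T302": {"T207"},
		"T303": {"T302"},
		"T308": {"T303", "T208"},
	}
	order, err := topoOrder(deps)
	fmt.Println(order, err)
}
```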

Dependency-Free Parallel Tracks

These can be staffed independently from week 1:

  • UI track (T601 onward) — Phases 1–5 are now shipped; the REST surface the UI consumes is stable. UI track is the next active workstream rather than a parallel one.
  • Helm chart track (T702, T703, T705, T706) — once client-go SessionRuntime lands.
  • Distribution track (T305, T306, T307) — once goreleaser config exists. ✅ shipped in v0.3.0.
  • Documentation track — recipes for each integrated client (ADR 025) can be drafted as soon as the bridge contract is frozen.

Open Questions to Resolve Before Phase 1 Kickoff

  1. Object store choice for pack artifacts: bundled MinIO vs. require external S3? Resolved by ADR 031 (2026-04-08): bundle Garage as the Compose default; treat the storage layer as a pluggable S3 client so any external backend is a first-class option; never bundle MinIO (upstream archived 2026-02). Tracked by T211a/T211b/T211c in the Phase 2 table.
  2. Which weak open-weight models (and at which quantizations) form the reference benchmark cohort for the Model Success Rates SLO?
  3. Tenant boundary semantics for ADR 029 semantic memory — single-tenant only at GA, multi-tenant later?
  4. License choice for the platform repo. Resolved 2026-04-08: Apache License 2.0, picked specifically to maximize external contributions to the Capability Pack catalog. Apache 2.0 is the license every adjacent ecosystem (Kubernetes, OpenTelemetry, Helm, gRPC, Argo CD, Trivy, the Anthropic / OpenAI SDKs, chromedp, the Docker SDK) already uses, which means corporate legal teams have pre-approved contributions to it and vendors can ship official packs for their own products without dual-license friction. Patent grant via Section 3 covers the Chromium / ssh / git / vault patent surface. See LICENSE, NOTICE, and CONTRIBUTING.md for the full text and contribution flow.