Helmdeck — GitHub Milestones & Issue Checklists
Drop-in source for gh issue create and GitHub Projects. Each phase = one milestone. Each task ID from TASKS.md = one issue. Copy a section into gh milestone create + gh issue create --milestone ....
Milestone: v0.1 — Core Infrastructure (Phase 1)
Target: Week 4 · Exit: make smoke green end-to-end on Compose · Ships in: v0.1.0
- T101 Bootstrap Go module + repo layout (cmd/control-plane, cmd/helmdeck-mcp, internal/)
- T102 goreleaser + GitHub Actions: build matrix, cosign, distroless image to ghcr.io
- T103 SessionRuntime interface + Docker SDK backend
- T104 Browser sidecar Dockerfile (Chromium, Marp, Tesseract, ffmpeg, xdotool, scrot, Xvfb, XFCE4, noVNC)
- T105 Session lifecycle REST + watchdog
- T106 CDP integration via chromedp
- T107 JWT auth middleware
- T108 SQLite migrations + Postgres-compatible schema
- T109 Audit log writer
- T110 Compose stack deploy/compose/compose.yaml
- T111 make smoke end-to-end harness
Milestone: v0.2 — AI Gateway & Pack Substrate (Phase 2)
Target: Week 8 · Exit: ≥90% weak-model success on browser.screenshot_url + web.scrape_spa against the MiniMax-M2.7 + Llama 3.2 7B cohort · Ships in: v0.2.0
- T201 OpenAI-compatible /v1/chat/completions + /v1/models
- T202 Provider adapters: Anthropic, Gemini, OpenAI, Ollama, Deepseek
- T203 AES-256-GCM encrypted key store + rotation
- T204 Fallback chain rules engine
- T205 Pack Execution Engine
- T206 Closed-set typed error code enforcement
- T207 Pack registry + REST dispatch + version routing
- T208 Built-in pack: browser.screenshot_url
- T209 Built-in pack: web.scrape_spa
- T210 Built-in pack: slides.render
- T211 Object store integration + signed URLs
- T211a Bundle Garage as default object store in Compose stack (ADR 031)
- T211b Artifact TTL janitor (ADR 031)
- T211c Cross-reference ADR 031 from ADRs 014/021 + README (ADR 031)
- T212 A2A Agent Card endpoint
- T213 A2A task endpoint with SSE
Milestone: v0.3 — MCP Bridge & Client Integrations (Phase 3)
Target: Week 10 · Exit: four-client smoke matrix green in CI · Ships in: v0.3.0
- T301 MCP server registry CRUD + transport adapters
- T302 Built-in MCP server exposing all packs
- T302a SSE MCP transport at /api/v1/mcp/sse (unblocks the sidecar pattern: containerized clients like OpenClaw point at the URL transport instead of having to bake the helmdeck-mcp stdio bridge into their image. PackServer is transport-agnostic so the SSE handler is a thin adapter; WS transport untouched. JWT-protected via the same router middleware as every other /api/v1/ route.)
- T303 helmdeck-mcp bridge binary
- T304 Bridge version-skew warning
- T305 Distribution: Homebrew tap + Scoop bucket + GH Releases (cosigned)
- T306 npm package @helmdeck/mcp-bridge
- T307 OCI image ghcr.io/tosin2013/helmdeck-mcp
- T308 CI smoke matrix: Claude Code · Claude Desktop · OpenClaw · Gemini CLI
- T309 "Connect" UI snippet generators (stubs)
Milestone: v0.4 — Desktop & Vision (Phase 4)
Target: Week 13 · Exit: vault-backed web.login_and_fetch + vision demo on Canvas page · Ships in: v0.4.0
- T401 Desktop Actions REST API (xdotool/scrot wrappers)
- T402 Built-in pack: desktop.run_app_and_screenshot
- T403 Built-in pack: doc.ocr
- T404 Built-in pack: web.fill_form — superseded by T503 (CDP cookie injection) + T408 (vision.fill_form_by_label)
- T405 Built-in pack: web.login_and_fetch — superseded by T504 (http.fetch with ${vault:NAME}) + T503
- T406 Built-in pack: slides.video — moved to Phase 6.5 as slides.narrate with expanded scope (ElevenLabs TTS + YouTube metadata). See T406 under Phase 6.5 below.
- T407 Vision-mode endpoint
- T408 Reference vision packs
- T409 noVNC live viewer baseline
- T410 Steel Browser integration — deferred indefinitely. Playwright MCP (T807a) and native computer-use tool routing (T807f) cover the browser automation surface. Steel Browser adds marginal value over the existing stack.
Milestone: v0.5 — Vault, Repo Packs & Hardening (Phase 5)
Target: Week 16 · Exit: repo.fetch against private GitHub via vault SSH key; OTel traces in Langfuse · Ships in: v0.5.0
- T501 Credential Vault (AES-256-GCM + ACL + usage log)
- T502 Credential types: login, cookies, API key, OAuth, SSH/git
- T503 CDP cookie injection + form-autofill fallback
- T504 HTTP gateway placeholder-token interception
- T505 Built-in pack: repo.fetch
- T506 Built-in pack: repo.push
- T507 OneCLI delegation mode — deferred indefinitely. The MCP bridge (T303) + SSE transport (T302a) + client integrations (T557–T563) already serve every client helmdeck targets. OneCLI adds a proprietary CLI layer with no clear demand signal.
- T508 NetworkPolicy egress allowlist + metadata IP block
- T509 Sandbox baseline (non-root, drop caps, seccomp)
- T510 OpenTelemetry GenAI instrumentation
- T511 Trivy CI scan gate (2026-04-21 hardening: action pin bumped aquasecurity/trivy-action@0.28.0 → 0.35.0 after upstream yanked the older tag; scanners: vuln,misconfig added to both steps so the 0.35 default-on secret scanner doesn't false-positive on UI placeholders and dev-only compose credentials. Secret detection deferred to the new gitleaks workflow in T511a.)
- T511a Gitleaks secret-scan workflow + .gitleaks.toml allowlist (closes the secret-detection gap left by T511's scope tightening. .github/workflows/gitleaks.yml runs gitleaks/gitleaks-action@v2 on every push + PR against main with fetch-depth: 0 so the scanner walks full commit history. Allowlist covers deploy/compose/garage.toml — three stable dev credentials the file's own header already documents as override-in-production; local scan across 142 commits is clean. No license needed for the free tier; workflow file flags what changes if helmdeck ever moves under an org.)
- T511b Contributor CI-parity tooling (new make check target wraps vet + -race test + build — the exact three gates CI's vet + test + build job runs — so contributors can verify locally before pushing. New make install-hooks wires core.hooksPath at .githooks/, enabling an opt-in pre-push hook that runs make check automatically. TestBridgeRoundTrip race fix: wrapped the test-side stdout/stderr bytes.Buffer in a sync.Mutex-guarded safeBuffer — production internal/bridge/bridge.go untouched since the real binary writes to kernel-serialized file descriptors, not a shared buffer. CONTRIBUTING.md updated with both commands.)
Milestone: v0.5.5 — Code Edit Loop (Phase 5.5)
Target: alongside Phase 5 · Exit: every client in docs/integrations/ has a setup guide, and at least Claude Code is marked ✅ tested against the Phase 5.5 code-edit loop (repo.fetch → fs.* → cmd.run → git.commit → repo.push) · Ships in: v0.5.0 / v0.5.1 (interleaved)
- T550 Built-in pack: fs.read (read file from clone)
- T551 Built-in pack: fs.write (write file to clone)
- T552 Built-in pack: fs.patch (literal search-and-replace)
- T553 Built-in pack: fs.list (find files under clone path)
- T554 Built-in pack: cmd.run (run an arbitrary command in clone)
- T555 Built-in pack: git.commit (stage + commit changes)
- T556 http.fetch placeholder-token demo pack (landed with T504)
- T557 docs/integrations/README.md — index + per-client status matrix (✅ tested / 🟡 documented / ⚪ planned)
- T558 docs/integrations/claude-code.md — setup + Phase 5.5 loop walkthrough (🟡 — awaiting end-to-end walk to flip to ✅)
- T559 docs/integrations/claude-desktop.md — setup + Phase 5.5 loop walkthrough (🟡)
- T560 docs/integrations/openclaw.md — setup + Phase 5.5 loop walkthrough (🟡; also corrected connect.go openclaw shape to real ~/.openclaw/openclaw.json)
- T561 docs/integrations/nemoclaw.md — wrapper over openclaw.md with sandbox-specific notes; NemoClaw reuses OpenClaw's MCP schema inside the sandbox so it is intentionally not a separate connect.go target (🟡)
- T562 docs/integrations/gemini-cli.md — setup + Phase 5.5 loop walkthrough (🟡)
- T563 docs/integrations/hermes-agent.md — setup + Phase 5.5 loop walkthrough; added hermes-agent case to connect.go (YAML config, format: "yaml" field) (🟡)
- T564 scripts/validate-clients.sh — manual helper that boots the stack and prints connect snippets + a copy-pasteable JSON-RPC code-edit-loop scenario (no pass/fail automation)
- T565 Walk the Phase 5.5 code-edit loop against OpenClaw end-to-end and flip docs/integrations/claude-code.md + README.md matrix to ✅ — the actual milestone exit gate
- T570 scripts/install.sh — one-command bootstrap on a fresh box. Verified end-to-end on the dev box (all four scenarios: happy path, login round-trip, idempotent re-run, --reset rotates password). Surfaced and fixed nine pre-existing wiring bugs in compose.yaml / garage.toml / garage-init / Dockerfiles along the way. Multipass VM verification still recommended before tagging v0.6.0.
Milestone: v0.6 — Management UI (Phase 6)
Target: Week 20 · Exit: every read-only Phase 6 panel ships against a real backend; pack authoring (schema editor + handler runtime + publish) is deferred to Phase 8 alongside T801 (WASM Executor) — see T608 below · Ships in: v0.6.0
- T601 React/Tailwind/shadcn shell + JWT login
- T602 Dashboard panel (stat cards + status table; Recharts memory chart in T602a)
- T603 Browser Sessions panel (read-only list; New Session modal in T603a)
- T604 AI Providers panel (read-only key list backed by GET /api/v1/providers/keys; Add/Rotate modal in T604a)
- T605 MCP Registry panel (read-only list; Add Server modal in T605a)
- T606 Capability Packs panel (read-only list grouped by namespace; Test Runner in T606a)
- T202a Wire provider adapters into the gateway registry at startup (gap discovered while preparing OpenClaw validation: T202 shipped the adapter code but the integration step — instantiating each adapter from a stored key and registering it with gateway.Registry — was never wired in cmd/control-plane/main.go. Without this fix /v1/models returned empty, /v1/chat/completions always 404'd, and the T607 success-rate panel could never show data. Fix: new internal/gateway/hydrate.go reads the keystore at boot and on every key add/rotate/delete (hot reload), plus an env-var fast path for OpenAI-compatible aggregators like OpenRouter via HELMDECK_OPENROUTER_API_KEY. 2026-04-21: community PRs extended LoadCustomOpenAIProviders with Groq (PR #45 by @Dev-31, issue #35, internal/gateway/hydrate_groq.go) and Mistral (PR #47, resolved from @vijit-vishnoi's PR #46, issue #36, internal/gateway/hydrate_mistral.go) adapters — both ride the same HELMDECK_{PROVIDER}_API_KEY[_FILE] / _BASE_URL / _MODELS env-var contract, both register under {provider}/ prefix in /v1/models, both default to a single sensible upstream model.)
- T607 Model Success Rates tab (provider_calls table written by gateway dispatch on every success/error path; GET /api/v1/providers/stats aggregates by (provider, model) over a configurable window; rendered as a second section on the AI Providers panel)
- T608 Pack Authoring UI (schema editor + Go/WASM handler + publish) — deferred to Phase 8, clustered with T801 (WASM Executor) and T803 (Procedural→Pack promotion). Today the pack registry is in-process and has no publish surface; building one means landing either a sandboxed code runtime (WASM, T801) or a composite-pack JSON runtime first. Neither is on the v0.6.0 critical path. Read-only Capability Packs panel (T606) ships in v0.6.0; authoring lands in v1.x.
- T609 Security Policies panel (read-only snapshot of egress allowlist + sandbox baseline + auth + telemetry; backed by new GET /api/v1/security; edit + reload-config in T609a)
- T610 Credential Vault panel (read-only list; Add Credential modal + Usage Log in T610a)
- T611 Audit Logs panel (GET /api/v1/audit + filters: event_type / severity / actor / from / to / limit; React panel replaces stub)
- T612 Connect Clients panel (per-client cards with snippet + copy button for claude-code, claude-desktop, openclaw, gemini-cli, hermes-agent; OS-detected one-liners in T612a)
- T504 repo.fetch + repo.push HTTPS clone/push support with vault-stored PAT via GIT_ASKPASS (public repos need no credential; private repos pass "credential":"vault-name")
- T504a Session pinning via _session_id input field — repo.fetch preserves session for follow-on packs (fs.*, cmd.run, git.commit, repo.push) to reuse
- T615 GitHub PAT setup in scripts/install.sh — optional interactive prompt stores token in vault as github-token
- T616 GitHub webhook listener at POST /api/v1/webhooks/github — HMAC-SHA256 validated, async pack dispatch per event rules (ADR 033; Phase 1: push + pull_request, env-var rules)
- T617 Core github.* pack set — github.create_issue, github.list_prs, github.post_comment, github.create_release using vault-stored PATs via api.github.com REST (ADR 034)
- T618 github.list_issues + github.search — complete the GitHub CRUD + search set so agents can read and search issues/code, not just create them
- T619 git.diff + git.log — agents review what changed before committing
- T620 fs.delete — remove files in a session-local clone path
- T621 browser.interact — deterministic multi-step browser automation (navigate, click, type, scroll, screenshot, assert_text). Uses existing chromedp. The building block for AI-powered web.test in Phase 7.
- T302b MCP inline image content — pack artifacts under 1 MB returned as type: "image" base64 content blocks in tools/call responses so vision-capable LLMs can see screenshots in one round trip (ADR 032)
- T613 Artifact Explorer UI panel — standalone /artifacts route in the Management UI with image preview, download button, pack/date filters, backed by GET /api/v1/artifacts (ADR 032)
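T616 says the webhook listener validates deliveries with HMAC-SHA256. The check could look roughly like the sketch below. It assumes GitHub's documented X-Hub-Signature-256 header format ("sha256=" followed by the hex digest of the raw request body); signBody and validSignature are hypothetical names, and helmdeck's actual handler is not shown in this document.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// signBody (hypothetical helper) produces the header value a sender
// would attach: "sha256=" + hex(HMAC-SHA256(secret, body)).
func signBody(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// validSignature (hypothetical helper) recomputes the MAC over the raw
// body and compares it to the header value in constant time.
func validSignature(secret, body []byte, header string) bool {
	const prefix = "sha256="
	if !strings.HasPrefix(header, prefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, prefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-webhook-secret") // placeholder value
	body := []byte(`{"ref":"refs/heads/main"}`)
	header := signBody(secret, body)

	fmt.Println(validSignature(secret, body, header))               // true
	fmt.Println(validSignature(secret, []byte("tampered"), header)) // false
}
```

The comparison goes through hmac.Equal rather than == so that the time taken does not leak how many leading bytes matched.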
Milestone: v1.0 — Kubernetes & GA (Phase 7)
Target: Week 22 · Exit: Helm install on fresh GKE/EKS passes smoke; security audit clean · Ships in: v1.0.0
- T701 client-go SessionRuntime backend
- T702 Helm chart charts/baas-platform/
- T703 PostgreSQL StatefulSet sub-chart
- T704 Session pod template (seccomp, shm, restartPolicy: Never)
- T705 NetworkPolicy: control-plane → sessions on 9222
- T706 NetworkPolicy: session egress restriction
- T707 KEDA ScaledObject on custom metrics
- T708 browser-pool-warmup Deployment + claim protocol
- T709 isolation.level Helm value (standard/enhanced/maximum)
- T710 cert-manager + Ingress-NGINX TLS
- T711 OTel Collector DaemonSet/sidecar
- T712 External Secrets Operator integration
- T713 Argo CD reference manifest
- T714 Load test (100 concurrent, 24h soak)
- T715 External security audit
Milestone: v0.8 — MCP Server Hosting & Pack Evolution (Phase 6.5) ✅
Status: Complete. v0.8.0 tagged. 36 packs ship. scripts/validate-phase-6-5.sh is the validation harness. · Ships in: v0.8.0
This phase validated the "host, don't rebuild" architecture from ADR 035 and added native computer-use tool routing (T807f), narrated video generation (T406), and two composite packs (research.deep, content.ground). The container topology: Playwright MCP in the sidecar (shares Chromium), Firecrawl + Docling as separate optional compose services, ElevenLabs as a cloud TTS API with vault-stored key.
- T807a Bundle Playwright MCP (@playwright/mcp) in the browser sidecar Dockerfile; auto-register when a session starts (ADR 035) (sidecar.Dockerfile layer 4b installs Node 20 + @playwright/mcp@latest globally with PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 so the Playwright postinstall doesn't pull ~200 MB of bundled Chromium that would never be used — the system chromium from layer 2 is the only browser. Entrypoint launches npx @playwright/mcp --cdp-endpoint http://127.0.0.1:9222 --host 0.0.0.0 --port 8931 --headless --no-sandbox after Chromium is live, so Playwright MCP attaches to the same browser process the existing browser.* chromedp packs use instead of launching its own — one Chromium, one cookie jar, shared state. Auto-registration surfaces as a new PlaywrightMCPEndpoint field on session.Session populated by the Docker runtime from container inspect, exposed on the session REST API as playwright_mcp_endpoint (http://<container-ip>:8931/mcp, matching upstream's standalone --port mount point); there's no entry in the external /api/v1/mcp/servers registry because that's for operator-configured MCP servers, not auto-launched sidecar children. Opt-out via HELMDECK_PLAYWRIGHT_MCP_ENABLED=false — handled in both the entrypoint (skips the npx launch) and in buildPlaywrightMCPEndpoint (returns empty string so downstream packs see the disabled state cleanly instead of connecting to a closed port). 4 new unit tests cover happy path, opt-out, typo-tolerant true, and the no-IP edge case.)
- T807b Add Firecrawl as an optional compose service (HELMDECK_FIRECRAWL_ENABLED=true); new web.scrape pack — no selectors, returns clean markdown (ADR 035) (overlay file deploy/compose/compose.firecrawl.yml brings up firecrawl + playwright-service + redis on baas-net; pack registered unconditionally but handler gates on the env var so operators who haven't enabled the overlay get an actionable error pointing at the exact toggle; target URL is run through the egress guard before the Firecrawl call so the sidecar can't be used as an SSRF pivot to reach cloud metadata; 9 table-driven tests cover happy path, disabled-by-default, format whitelist, egress block, upstream 500, success=false, and empty-markdown edge cases)
- T807c Add Docling as an optional compose service (HELMDECK_DOCLING_ENABLED=true); new doc.parse pack — full document understanding (PDF layout, tables, multi-format, OCR) replacing doc.ocr (ADR 035) (overlay file deploy/compose/compose.docling.yml brings up quay.io/docling-project/docling-serve:latest on baas-net with a named model-cache volume so cold restarts drop from ~45s to ~5s; pack accepts either source_url (http_sources) or source_b64 + filename (file_sources) and hits POST /v1/convert/source; target URL is run through the egress guard so Docling can't be coerced into pulling cloud metadata, file sources skip the guard since bytes are inline; closed-set output formats (md/text/html) with md always force-included so the output schema's required markdown field stays populated; partial_success passes through unchanged while failure/skipped surface as handler_failed with Docling's own error list; doc.ocr stays in the catalog as the lightweight Tesseract-only fallback; 15 table-driven tests cover http/file happy paths, disabled-by-default, exactly-one-source rule, invalid-base64, format whitelist, do_ocr=false round-trip, egress guard both ways, upstream 500, status=failure, partial_success, and empty markdown)
- T807d browser-use / Skyvern wrapper — superseded by T807f. Mid-planning research showed that all three frontier providers (Anthropic, OpenAI, Google) now ship native computer-use tool schemas (2026), all client-runtime. Wrapping browser-use or Skyvern would embed a Python agent loop inside helmdeck's Go pack engine for functionality the models already provide natively. T807f upgrades helmdeck's existing vision.* + desktop sidecar to speak the provider-native schemas directly — same capability, no new runtime, vault-aware credential safety, cross-provider out of the box.
- T807f Native computer-use tool routing + observability hooks (ADR 035, supersedes T807d) (six work packages: (A) gateway.ChatRequest.Tools/ToolChoice + ContentPart tool_use/tool_result + Anthropic/OpenAI/Gemini adapter translation with provider-specific wire formats, (B) eight new desktop REST primitives (double_click, triple_click, drag, scroll, modifier_click, mouse_move, wait, zoom) covering the full Claude computer_20251124 / Gemini computer-use-preview action vocabulary, (C) vision.StepNative — one iteration of screenshot → ChatRequest with Tools=[computer] + ToolChoice=any → parse tool_use → dispatch via xdotool, routing via SupportsNativeComputerUse for Anthropic/OpenAI/Gemini with JSON-prompt fallback for Ollama/Deepseek, ComputerUseAction expanded internal type with provider-aware parsers including Gemini 0-1000 normalized coordinate scaling, (D) EventComputerUse audit constant + per-step screenshot artifact upload to Garage S3 for replay via the /artifacts panel, (E) AgentStatus field on VNCInfo + POST /api/v1/desktop/agent_status endpoint for noVNC witness-mode banner overlay, (F) docs + ADR 035 revision. Innovation angles: cross-provider schema abstraction (swap model field, same desktop), vault-aware typing (model never sees credentials), audit-backed replay, live human observability via noVNC. ~3200 lines across gateway/api/vision/packs/audit + 80 new tests.)
- T807e web.test — natural language browser testing via Playwright MCP accessibility tree (ADR 035) (new internal/pwmcp/ package is a narrow streamable-HTTP client for @playwright/mcp — Initialize captures the Mcp-Session-Id header and replays it on every follow-up, ToolsCall posts JSON-RPC and decodes either application/json or text/event-stream responses so it works against both Playwright MCP's single-shot and streamed tool-call paths; 5 client unit tests cover init+session-id, RPC errors, tool-level isError, SSE framing, and upstream 5xx. The web.test pack itself is in internal/packs/builtin/webtest.go (renamed from web_test.go — the _test.go suffix is reserved by Go for test files, which silently excludes the production code from the build). Input takes {url, instruction, model, max_steps?, assertions?}; handler flow is (1) validate + egress-check target URL, (2) read PlaywrightMCPEndpoint from the session populated by T807a (refuses with a clear CodeSessionUnavailable pointing operators at T807a if empty), (3) initialize the MCP session, (4) seed browser_navigate + browser_snapshot so the model's first turn sees the page without wasting a step on deterministic work, (5) plan-step loop: ask the gateway LLM for one tool call as JSON given (goal, current snapshot, compact history), parse it via a balanced-brace scanner tolerant of prose and markdown code fences, guard any mid-test browser_navigate through the egress check so the model can't pivot to metadata, execute via pwmcp, re-snapshot, repeat until done, fail, or max_steps. Optional assertions run a substring match against the final snapshot; completed=false if either the model didn't say done or any assertion failed. Registered conditionally in cmd/control-plane/main.go alongside the vision packs — only when a gateway dispatcher is configured. 13 table-driven tests cover happy path w/ assertions, missing session, empty PWMCP endpoint, missing fields (url/instruction/model), egress-blocks-target, egress-blocks-mid-test (verifies the blocked call is NOT forwarded to MCP), initialize failure, model emits fail with reason, max_steps exhausted, assertion-failed final report, unparseable JSON, prose/markdown-wrapped JSON tolerance, and unknown tool name. browser.interact (T621) stays in the catalog as the deterministic LLM-free option when the caller already knows the refs.)
- T622 research.deep — Firecrawl-backed deep research: search a topic across multiple sources, scrape each to clean markdown, return a synthesis. Composite pack chaining Firecrawl search + scrape APIs. Depends on T807b. (handler takes {query, model, limit?, max_tokens?}, fans out a single POST to /v1/search with scrapeOptions.formats=["markdown"] so search + per-source scrape happen in ONE upstream round trip — self-hosted Firecrawl's default Google backend handles the search with no extra config, SearXNG is supported via SEARXNG_ENDPOINT on the Firecrawl container. Limit defaults to 5 and hard-caps at 10 because larger values blow up synthesis token usage linearly. The frozen synthesis prompt instructs the model to cite URLs inline in parentheses and stay factual; user message is QUERY: + SOURCES: blocks (one --- Source N: URL --- header per item). Empty-markdown items are dropped before synthesis and the whole call fails with handler_failed if nothing usable survives. Shared Firecrawl HTTP helper callFirecrawlSearch landed in the same file so T623 content.ground can reuse it — single place to tune timeouts, response caps (16 MiB), and error shaping. Registered conditionally in main.go alongside the vision packs and web.test (needs a gateway dispatcher); env-gated on HELMDECK_FIRECRAWL_ENABLED. 10 table-driven tests cover happy path (synthesis receives both source bodies + URL headers), disabled-by-default, missing query, missing model, limit cap, Firecrawl 500, success=false, all-empty markdown → handler_failed, synthesis dispatch failure (quota error propagation), and whitespace-only synthesis response.)
- T623 content.ground — link grounding for blog posts: parse a markdown file for claims, search GitHub + web for authoritative sources, insert real [source](url) links directly in the file. (ADR 035) (session-scoped pack that takes {clone_path, path, model, max_claims?, topic?} and runs a two-phase pipeline against a markdown file inside a session-local clone. Phase 1: read the file via the same session-executor wc+cat pattern fs.patch uses, then ask the gateway LLM for a strict-JSON claim plan — the prompt requires text to be a VERBATIM substring of the post so the literal substring match in phase 2 works deterministically. Phase 2: for each claim, call Firecrawl /v1/search (via the shared callFirecrawlSearch helper T622 established) without scrapeOptions — grounding only needs the URL, not the body — pick the first result with a non-empty URL, and rewrite the markdown with strings.Replace(..., count=1) so only the first occurrence is annotated. Claims whose text doesn't literally appear in the file (hallucination) are skipped in the skipped[] report rather than corrupting the file. Claims with no source found are also skipped. Write-back only fires when the patched text actually differs — otherwise the file's mtime stays clean. Milestone originally described the pipeline as github.search + http.fetch + web.scrape + fs.patch but collapsing to one Firecrawl /v1/search covers "GitHub + web" in a single call (Google indexes GitHub repos/docs/issues); the topic input hint lets callers bias the extractor's generated queries toward site:github.com etc. without a second integration. Env-gated on HELMDECK_FIRECRAWL_ENABLED, registered conditionally on gateway dispatcher availability in main.go. 10 top-level tests (12 subtests) cover happy path with two claims both grounded and write-back asserted, empty claim plan → no file touch, hallucinated substring skipped while a good claim in the same batch still grounds, no source found → no file touch, disabled env var, missing executor → session_unavailable, missing required fields (clone_path/path/model subcases), empty file rejection, unparseable claim JSON, and max_claims input cap (5) overriding a 10-claim model response.)
- T406 slides.narrate — moved from Phase 4 (originally deferred as slides.video). Narrated MP4 video from Marp slide decks. Pipeline: parse speaker notes from <!-- --> comments → export per-slide PNGs via marp --images → ElevenLabs TTS per slide (API key from vault elevenlabs-key, voice randomly picked from top 5 when not specified) → ffmpeg per-slide segments with timed audio alignment → concatenate with optional fade transitions → LLM-generated YouTube metadata (title, description with M:SS timestamps, tags, category). Degrades gracefully: no vault key → silent video, no metadata_model → skip metadata. 23 tests (parser: 13, handler: 10). (2026-04-18: bumped SessionSpec.MemoryLimit to 2g after OOM on segment 4 at 1080p — ffmpeg ~1.1 GB + Chromium baseline ~700 MB overflowed the 1 GiB default. Encoding is serial so memory is resolution-bound, not slide-count-bound; see the Resource scaling block in skills/helmdeck/SKILL.md.)
- T622a repo.fetch context envelope + repo.map — repo.fetch now returns tree, readme, entrypoints, doc_hints, and signals alongside the existing clone_path/commit/files so agents orient on the first turn without chaining fs.list + fs.read. Closes the 2026-04-14 OpenClaw "empty repo" false-positive on low-latency-performance-workshop (README.adoc wasn't auto-detected; the envelope's glob match surfaces it). Companion opt-in pack repo.map (pack #36) produces an Aider-style structural symbol map (ctags + python3 reducer inside the sidecar) under a token budget, ranked by symbol density with junk-file excludes (lockfiles, node_modules, minified bundles) and a languages filter that maps common globs (*.go, *.py, *.ts, ...) to ctags --languages= names. ADR 022 §2026-04-15 revision + new ADR 036. 15 new tests (envelope: 5, repo.map: 10 incl. 4 live-integration).
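The T807e plan-step loop parses the model's reply "via a balanced-brace scanner tolerant of prose and markdown code fences". One way such a scanner might work is sketched below. extractJSONObject, the toolCall type, and its tool/args JSON keys are all hypothetical; the checklist does not specify the real wire shape.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// toolCall is an assumed shape for the model's single tool call.
type toolCall struct {
	Tool string         `json:"tool"`
	Args map[string]any `json:"args"`
}

// extractJSONObject returns the first balanced {...} span in s,
// ignoring braces that appear inside JSON string literals.
func extractJSONObject(s string) (string, bool) {
	start := strings.IndexByte(s, '{')
	if start < 0 {
		return "", false
	}
	depth, inString, escaped := 0, false, false
	for i := start; i < len(s); i++ {
		c := s[i]
		if inString {
			switch {
			case escaped:
				escaped = false // character after a backslash is literal
			case c == '\\':
				escaped = true
			case c == '"':
				inString = false
			}
			continue
		}
		switch c {
		case '"':
			inString = true
		case '{':
			depth++
		case '}':
			depth--
			if depth == 0 {
				return s[start : i+1], true
			}
		}
	}
	return "", false // braces never balanced
}

func main() {
	fence := strings.Repeat("`", 3) // a markdown code fence
	reply := "Sure, opening the page first:\n" + fence + "json\n" +
		`{"tool": "browser_navigate", "args": {"url": "https://example.com"}}` +
		"\n" + fence

	raw, ok := extractJSONObject(reply)
	if !ok {
		fmt.Println("no JSON object found")
		return
	}
	var call toolCall
	if err := json.Unmarshal([]byte(raw), &call); err != nil {
		fmt.Println("invalid JSON:", err)
		return
	}
	fmt.Println(call.Tool, call.Args["url"]) // browser_navigate https://example.com
}
```

Tracking string state matters because a snapshot or URL quoted inside an argument can itself contain braces.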
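Phase 2 of T623 rewrites the markdown with strings.Replace and a count of 1, and skips claims that are not a literal substring of the file. A small sketch of that rule follows. groundClaim is a hypothetical helper, and appending the link directly after the claim text is an assumption, since the checklist does not say exactly where the link is placed.

```go
package main

import (
	"fmt"
	"strings"
)

// groundClaim (hypothetical) annotates only the first literal occurrence
// of claim in markdown with a [source](url) link. It reports false and
// returns the input unchanged when the claim does not appear verbatim,
// which is how a hallucinated claim would be skipped.
func groundClaim(markdown, claim, url string) (string, bool) {
	if claim == "" || !strings.Contains(markdown, claim) {
		return markdown, false
	}
	annotated := claim + " [source](" + url + ")"
	return strings.Replace(markdown, claim, annotated, 1), true
}

func main() {
	post := "Go compiles to native code. Later: Go compiles to native code."

	out, ok := groundClaim(post, "Go compiles to native code.", "https://go.dev/doc/")
	fmt.Println(ok)
	fmt.Println(out)

	// A claim that is not a verbatim substring leaves the text untouched.
	_, ok = groundClaim(post, "Go was released in 2012.", "https://example.com")
	fmt.Println(ok)
}
```

Because the function returns the original string when nothing matched, a caller can compare input and output and skip the write-back, which matches the "mtime stays clean" behaviour described above.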
Milestone: v1.0 — Kubernetes & GA (Phase 7)
Status: Phase 6.5 complete (v0.8.0 tagged). 36 packs ship. All pre-GA feature work is done. Phase 7 is the production-readiness push: Kubernetes deployment, Helm chart, scaling, TLS, external secrets, load testing, security audit.
Milestone: v1.x — Innovation Backlog (Phase 8)
Target: Post-GA · no fixed week
- T801 WASM Executor + WASI capability inspection
- T802 Four-tier Memory API (Working/Episodic/Semantic/Procedural)
- T803 Procedural→Pack promotion UI
- T804 WebRTC live session streaming
- T805 Audio capture for desktop sessions
- T806 WebMCP detection + preferential routing
- T807 Pre-packaged Chrome DevTools MCP / Playwright MCP entries — completed by T807a (Phase 6.5). Playwright MCP bundled in the sidecar Dockerfile, auto-registered on session start. Chrome DevTools MCP is redundant — chromedp-based packs already drive CDP directly.
- T808 Firecracker isolation tier productionization
- T809 Lightpanda alternate browser engine evaluation
- T810 Pack marketplace registry model — index.yaml catalog, helmdeck-pack.yaml manifest schema, cosign trust verification, HELMDECK_MARKETPLACE_URL config (ADR 034)
- T811 command handler type — subprocess packs in any language (stdin JSON / stdout JSON protocol) with egress guard + audit
- T812 helmdeck pack install/uninstall CLI commands + POST /api/v1/marketplace/install endpoint
- T813 Marketplace UI panel — /marketplace route with browse-by-category, search, pack detail, install/uninstall, trust badges
- T814 Community marketplace repo (tosin2013/helmdeck-marketplace) with initial pack catalog + contribution guide
- T815 Pack ratings + install counts (requires marketplace-web frontend)
- T816 MCP Server Hosting framework — generic helmdeck mcp install <server> for community MCP servers with sandboxed execution; converges with ADR 034 marketplace (ADR 035)
Bulk-create script
#!/bin/bash
# scripts/bootstrap-issues.sh — run once after gh auth login
set -euo pipefail
REPO="tosin2013/helmdeck"
declare -a MILESTONES=(
  "v0.1 — Core Infrastructure"
  "v0.2 — AI Gateway & Pack Substrate"
  "v0.3 — MCP Bridge & Client Integrations"
  "v0.4 — Desktop & Vision"
  "v0.5 — Vault, Repo Packs & Hardening"
  "v0.5.5 — Code Edit Loop"
  "v0.6 — Management UI"
  "v0.8 — MCP Server Hosting & Pack Evolution"
  "v1.0 — Kubernetes & GA"
  "v1.x — Innovation Backlog"
)
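# Create each milestone; "|| true" lets the loop continue when a call
# fails (for example, because the milestone already exists).
for m in "${MILESTONES[@]}"; do
  gh api "repos/$REPO/milestones" -f title="$m" || true
done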
# Then parse this MILESTONES.md and gh issue create per checkbox.
# Each issue body should link to docs/TASKS.md#<task-id> and docs/adrs/<n>-*.md
Labels
Apply consistently: phase/1..phase/8, priority/P0..priority/P3, area/control-plane, area/sidecar, area/ui, area/helm, area/bridge, area/packs, area/vault, area/security, area/observability, kind/feature, kind/test, kind/docs.