
image.generate

Text → image via fal.ai's synchronous fal.run endpoint. Caller supplies a prompt; the pack POSTs to fal.run, downloads the resulting PNG/JPEG bytes, and stores them in helmdeck's artifact store. Returns an image_artifact_key the agent (or downstream packs) can resolve.

Day 1 ships fal.ai because its sync API returns the generated image URL in the same POST response — no polling loop, fits within the MCP 60s JSON-RPC timeout, ~150 lines of pack code. The engine input field is reserved so a follow-up PR can add Replicate (queue+poll) without changing the schema.

Default model is fal-ai/flux/schnell — fast (1-3s), cheap ($0.003/image), photorealistic enough for podcast covers, blog hero images, and slide shields. Operators pass their own model for FLUX dev/pro/SDXL/etc.

Setup prerequisite

Add the fal.ai API key to the Vault panel:

| Field | Value |
| --- | --- |
| Name | fal-key (exact string; this is the pack default, override it with the credential input) |
| Type | api_key |
| Host pattern | fal.run |
| Value | Your fal.ai API key (f1_…) |

Or set HELMDECK_FAL_KEY=... in deploy/compose/.env.local — once #142 (vault env-hydrate) lands, this auto-imports as fal-key on startup. Until then, the env var works as a last-resort fallback after the vault lookup.

A key is required. Without one, the pack returns invalid_input with the message "fal.ai key not found. Set HELMDECK_FAL_KEY in deploy/compose/.env.local...". (This is the same fail-loud shape as podcast.generate after #138.)

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | yes | | Plain-English description of the image. |
| engine | string | no | "fal" | Closed set; day 1 only "fal". |
| model | string | no | "fal-ai/flux/schnell" | Any fal.ai model id. Common: fal-ai/flux/dev, fal-ai/flux-pro, fal-ai/fast-sdxl. |
| image_size | string | no | model default | fal.ai-specific: square_hd, portrait_4_3, landscape_16_9, etc. See the model's fal.ai page. |
| num_images | number | no | 1 | 1-4. Each image is a separate image_artifact_key. |
| seed | number | no | random | For reproducibility. fal.ai echoes the seed it used in seed_used. |
| credential | string | no | "fal-key" | Vault credential name override. |
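The constraints in the table can be checked on the caller's side before the pack is invoked. The following is a minimal sketch, assuming the pack accepts exactly the fields listed above; `build_request` is a hypothetical helper and not part of helmdeck, and treating a whitespace-only prompt as empty is an assumption.

```python
from typing import Any, Dict, Optional

ALLOWED_ENGINES = {"fal"}  # closed set on day 1, per the Inputs table


def build_request(
    prompt: str,
    engine: str = "fal",
    model: str = "fal-ai/flux/schnell",
    image_size: Optional[str] = None,
    num_images: int = 1,
    seed: Optional[int] = None,
    credential: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate inputs and build an image.generate body."""
    # Assumption: a whitespace-only prompt counts as empty.
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if engine not in ALLOWED_ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    if not 1 <= num_images <= 4:
        raise ValueError("num_images must be in [1, 4]")

    body: Dict[str, Any] = {
        "prompt": prompt,
        "engine": engine,
        "model": model,
        "num_images": num_images,
    }
    # Optional fields are left out so the pack or model default applies.
    if image_size is not None:
        body["image_size"] = image_size
    if seed is not None:
        body["seed"] = seed
    if credential is not None:
        body["credential"] = credential
    return body
```

Calling `build_request("a cat", image_size="square_hd")` yields a body that can be sent as the JSON payload; an out-of-range `num_images` raises locally instead of costing a round trip.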

Outputs

| Field | Type | Notes |
| --- | --- | --- |
| image_artifact_key | string | First (or only) generated image. image.generate/<rand>.png. |
| image_size | number | Size of the first image in bytes. Unrelated to the image_size input preset. |
| engine | string | Echo. Always "fal" day 1. |
| model_used | string | Echo of resolved model id. |
| prompt_used | string | Echo. |
| seed_used | number | Whichever seed fal.ai actually used (echoed from the response). |
| image_artifact_keys | array | Present when num_images > 1. Same order as fal.ai's images[]. |
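Because image_artifact_keys only appears when num_images > 1, callers need a small branch to handle both response shapes. A minimal sketch follows; `artifact_keys` is a hypothetical helper, and it assumes that image_artifact_keys, when present, lists every image including the first.

```python
from typing import Any, Dict, List


def artifact_keys(response: Dict[str, Any]) -> List[str]:
    """Return every artifact key in an image.generate response, in order."""
    keys = response.get("image_artifact_keys")
    if keys:
        # Multi-image response (assumed to include the first image too).
        return list(keys)
    # Single-image response: only image_artifact_key is present.
    return [response["image_artifact_key"]]
```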

Vault credentials needed

fal-key (canonical name). If the vault lookup misses, the pack falls back to the HELMDECK_FAL_KEY env var. When neither is set, it hard-fails with invalid_input and a missing-credential-style message.
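The lookup order described here (vault first, env var second, loud failure last) can be sketched as below. This is an illustration only: the vault is modelled as a plain mapping, and `resolve_fal_key` and `MissingCredential` are hypothetical names rather than helmdeck's actual API.

```python
import os
from typing import Mapping, Optional


class MissingCredential(Exception):
    """Raised when neither the vault nor the environment holds a key."""


def resolve_fal_key(
    vault: Mapping[str, str],
    credential: str = "fal-key",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical lookup ladder: vault entry, then HELMDECK_FAL_KEY."""
    env = os.environ if env is None else env
    key = vault.get(credential)
    if key:
        return key  # vault hit wins
    key = env.get("HELMDECK_FAL_KEY")
    if key:
        return key  # last-resort fallback
    # Message text copied from the documented error, truncation included.
    raise MissingCredential(
        "fal.ai key not found. Set HELMDECK_FAL_KEY in deploy/compose/.env.local..."
    )
```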

Use it from your agent (OpenClaw chat-UI worked example)

OpenClaw chat capture pending.

Developer reference (curl)

TOKEN=$(./bin/control-plane -mint-token dev -mint-token-scopes admin)
curl -fsS -X POST http://localhost:3000/api/v1/packs/image.generate \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a cat sitting on a podcast microphone, photorealistic",
    "model": "fal-ai/flux/schnell",
    "image_size": "square_hd"
  }'

Response:

{
  "image_artifact_key": "image.generate/c34d.../image-000.png",
  "image_size": 287342,
  "engine": "fal",
  "model_used": "fal-ai/flux/schnell",
  "prompt_used": "a cat sitting on a podcast microphone, photorealistic",
  "seed_used": 42
}

Resolve the artifact:

curl -fsS -H "Authorization: Bearer $TOKEN" \
  http://localhost:3000/api/v1/artifacts/image.generate/c34d.../image-000.png \
  -o cover.png
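The same two calls can be made from a script. The sketch below uses only the Python standard library and mirrors the URLs and headers of the curl commands; the helper names and the 60-second client timeout are assumptions, not part of helmdeck.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # same host as the curl examples


def pack_request(token: str, body: dict) -> urllib.request.Request:
    """Build the POST that invokes the image.generate pack."""
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/packs/image.generate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def artifact_request(token: str, key: str) -> urllib.request.Request:
    """Build the GET that resolves an artifact key to image bytes."""
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/artifacts/{key}",
        headers={"Authorization": f"Bearer {token}"},
    )


def generate_and_save(token: str, prompt: str, out_path: str) -> dict:
    """Generate one image and write it to out_path; returns the pack output."""
    # Assumption: 60 s is a generous client-side timeout for both calls.
    with urllib.request.urlopen(pack_request(token, {"prompt": prompt}), timeout=60) as resp:
        result = json.load(resp)
    key = result["image_artifact_key"]
    with urllib.request.urlopen(artifact_request(token, key), timeout=60) as resp:
        data = resp.read()
    with open(out_path, "wb") as f:
        f.write(data)
    return result
```

With a token minted as in the curl example, `generate_and_save(token, "a cat on a microphone", "cover.png")` would perform both steps against a running control plane.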

Cost

| Model | Approx. cost / image | Approx. wall time |
| --- | --- | --- |
| fal-ai/flux/schnell | $0.003 | 1-3s |
| fal-ai/flux/dev | $0.025 | 3-8s |
| fal-ai/flux-pro | $0.05 | 5-15s |
| fal-ai/fast-sdxl | $0.005 | 1-2s |
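Until a cost preview exists in the pack itself, a rough estimate can be computed by hand from the table. A small sketch, assuming cost scales linearly with num_images and using the approximate per-image prices listed above (actual fal.ai billing may differ); `estimate_cost` is a hypothetical helper.

```python
# Approximate per-image prices in USD, as read from the cost table.
APPROX_COST_USD = {
    "fal-ai/flux/schnell": 0.003,
    "fal-ai/flux/dev": 0.025,
    "fal-ai/flux-pro": 0.05,
    "fal-ai/fast-sdxl": 0.005,
}


def estimate_cost(model: str, num_images: int = 1) -> float:
    """Rough cost of one call; raises KeyError for models not in the table."""
    if not 1 <= num_images <= 4:
        raise ValueError("num_images must be in [1, 4]")
    # Assumption: price is linear in the number of images.
    return round(APPROX_COST_USD[model] * num_images, 4)
```

For example, four fal-ai/flux/dev images come to roughly $0.10 under these assumptions.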

Cost preview / dry_run semantics like podcast.generate (#145) aren't yet wired into image.generate — file an issue if you need them.

Errors

| Code | When |
| --- | --- |
| invalid_input | Empty prompt; unknown engine; num_images outside [1, 4]; missing fal-key. |
| handler_failed | fal.ai returns 4xx/5xx; image download fails; response shape unexpected. |
| artifact_failed | Artifact store rejects the upload (disk full, S3 5xx, etc.). |
| internal | No artifact store wired (control-plane misconfiguration). |
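One way a caller might react to these codes is sketched below. The code names come from the table; the suggested reactions (in particular, retrying handler_failed) are assumptions of this sketch rather than documented helmdeck behaviour, and `suggested_action` is a hypothetical helper.

```python
from enum import Enum


class Action(Enum):
    FIX_REQUEST = "fix the request or add the credential"
    RETRY_LATER = "retry later; the upstream call may be transient"
    ASK_OPERATOR = "ask the operator to check storage or configuration"


# Assumed mapping from documented error codes to a caller reaction.
_ACTIONS = {
    "invalid_input": Action.FIX_REQUEST,
    "handler_failed": Action.RETRY_LATER,
    "artifact_failed": Action.ASK_OPERATOR,
    "internal": Action.ASK_OPERATOR,
}


def suggested_action(code: str) -> Action:
    """Map an error code to a reaction; unknown codes go to the operator."""
    return _ACTIONS.get(code, Action.ASK_OPERATOR)
```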

Future engines

The engine field is closed-set ("fal" only) day 1. Adding Replicate is a community-friendly follow-up: switch on engine, factor out the fal.run logic into an internal falEngine, add a replicateEngine peer that handles the queue+poll loop. The credential lookup ladder + artifact upload paths are engine-agnostic — only the request/response shape changes.
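The "switch on engine" refactor described above might look roughly like the following. This is a language-neutral illustration written in Python, not the pack's actual code; `fal_engine`, `ENGINES`, and `dispatch` are hypothetical names, and the fal engine body is a stub that makes no network call.

```python
from typing import Callable, Dict

# An engine takes the validated pack input and returns engine-specific output.
Engine = Callable[[dict], dict]


def fal_engine(request: dict) -> dict:
    """Stub for the synchronous path; a real engine would POST to fal.run."""
    return {
        "engine": "fal",
        "model_used": request.get("model", "fal-ai/flux/schnell"),
    }


# Closed set of engines; a Replicate engine would be registered here later.
ENGINES: Dict[str, Engine] = {"fal": fal_engine}


def dispatch(request: dict) -> dict:
    """Pick the engine named in the request, defaulting to "fal"."""
    name = request.get("engine", "fal")
    try:
        engine = ENGINES[name]
    except KeyError:
        raise ValueError(f"invalid_input: unknown engine {name!r}") from None
    return engine(request)
```

Under this shape, credential lookup and artifact upload would sit outside `dispatch`, so a queue-and-poll engine only has to supply its own request/response handling.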

See #71 for the original spec; #146 tracks the chained-into-podcast/slides/blog work that builds on this pack.