`content.ground`

The "ground these claims with sources" pack. The caller supplies markdown, either inline as `text` or by reference to a file in a session clone (`clone_path` + `path`). The pack:

1. Asks an LLM to extract up to `max_claims` high-impact claims, using a strict JSON schema; each claim must be an exact substring of the source text.
2. For each claim, runs Firecrawl `/v1/search` and picks the first non-empty URL.
3. Appends `[source](url)` after each grounded claim, in place.
4. Returns the patched text (text mode) or writes the file back (file mode).

The "claims must be exact substrings" rule is load-bearing: it prevents the model from drifting between "what was claimed" and "what got cited," which is the most common failure mode in two-context-window grounding.
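
The exact-substring rule and the in-place append step can be illustrated with a minimal Python sketch. This is not the pack's implementation: the `Grounding` type, the `apply_groundings` helper, the leading space before the link, and the choice to patch only the first occurrence of a claim are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Grounding:
    claim: str  # expected to be an exact substring of the source text
    url: str


def apply_groundings(text: str, groundings: list[Grounding]) -> tuple[str, list[str]]:
    """Append [source](url) after each claim; return (patched_text, rejected_claims)."""
    rejected: list[str] = []
    inserts: list[tuple[int, str]] = []
    for g in groundings:
        start = text.find(g.claim) if g.claim else -1
        if start == -1:
            # The claim drifted from the source text, so it cannot be cited safely.
            rejected.append(g.claim)
            continue
        # Assumption: only the first occurrence of the claim is patched.
        inserts.append((start + len(g.claim), f" [source]({g.url})"))
    # Apply from the end of the text backwards so earlier offsets stay valid.
    patched = text
    for pos, link in sorted(inserts, reverse=True):
        patched = patched[:pos] + link + patched[pos:]
    return patched, rejected
```

Because every insert position is computed against the original text, a paraphrased claim fails the substring lookup and is rejected rather than cited in the wrong place.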

This pack exists as one tool instead of an agent-orchestrated research.deep + fs.patch chain because (a) the claim text must match the source file exactly, which is fragile across two LLM context windows, (b) one file write per run reduces session-executor RPC overhead, and (c) a strict JSON schema keeps every caller consistent.

Setup prerequisite

Needs the Firecrawl overlay (same toggle as research.deep and web.scrape):

HELMDECK_FIRECRAWL_ENABLED=true

Inputs

Two input modes: supply either `text` (in-memory) or `clone_path` + `path` (session-file mode), not both.

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| `text` | `string` | one of | - | Markdown to ground inline. The patched markdown comes back in the response; nothing is written to disk. Use this when the user provides markdown in chat. |
| `clone_path` | `string` | one of | - | Session clone root. Required if `path` is set. |
| `path` | `string` | with `clone_path` | - | Relative markdown file path inside the clone (e.g. `posts/2026-quantum.md`). The pack patches it in place. |
| `model` | `string` | yes | - | Provider/model for claim extraction. Strict JSON-schema output; needs a tool-capable model. |
| `max_claims` | `number` | no | `5` | Cap on claims to ground. Hard cap at 8 (Firecrawl per-call cost). |
| `topic` | `string` | no | - | Hint for the claim extractor. For example, `"quantum computing"` narrows extraction to topic-relevant claims and biases the search step. |
| `rewrite` | `boolean` | no | `false` | When `true`, the LLM also rewrites weak claims into stronger prose backed by the discovered source. More expensive (multiple LLM passes); use when "make this blog post more credible" is the goal. |
| `_session_id` | `string` | yes (file mode) | - | Required when `clone_path` is set; not used in text mode. |
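
A caller may want to check arguments against these rules before invoking the pack. The sketch below is a hypothetical client-side helper (`normalize_args` is not part of helmdeck). Clamping `max_claims` into the range 1 to 8, rather than rejecting out-of-range values, is an assumption; the pack may behave differently.

```python
MAX_CLAIMS_HARD_CAP = 8  # hard cap stated in the Inputs table
DEFAULT_MAX_CLAIMS = 5   # default stated in the Inputs table


def normalize_args(args: dict) -> dict:
    """Validate the two input modes and fill in defaults (hypothetical helper)."""
    has_text = "text" in args
    has_file = "clone_path" in args or "path" in args
    if has_text == has_file:
        raise ValueError("supply exactly one of: text, or clone_path + path")
    if has_file:
        # File mode needs the clone root, the relative path, and the session id.
        for key in ("clone_path", "path", "_session_id"):
            if key not in args:
                raise ValueError(f"file mode requires {key}")
    if "model" not in args:
        raise ValueError("model is required")
    out = dict(args)
    requested = int(args.get("max_claims", DEFAULT_MAX_CLAIMS))
    # Assumption: clamp instead of failing when the caller exceeds the cap.
    out["max_claims"] = max(1, min(requested, MAX_CLAIMS_HARD_CAP))
    out.setdefault("rewrite", False)
    return out
```

Validating up front keeps a malformed call from spending an LLM pass and Firecrawl searches before it fails.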

Outputs

| Field | Type | Notes |
| --- | --- | --- |
| `path` | `string` | Echo (only in file mode). |
| `claims_considered` | `number` | Claims the LLM extracted (≤ `max_claims`). |
| `claims_grounded` | `number` | Of those, how many had a source found via search. |
| `grounding` | `array` | `[{claim: "<exact substring>", url, title}]` for every grounded claim. |
| `skipped` | `array` | Claims with no usable source. The agent can decide whether to soften them or remove them. |
| `text` | `string` | (Text mode only.) The patched markdown. |
| `sha256` | `string` | Hex sha256 of the patched content. |
| `file_changed` | `boolean` | (File mode only.) `false` when no claims were grounded → file untouched. |
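
As a rough illustration of consuming this response, the sketch below verifies `sha256` against the returned `text` and builds a short grounded-versus-skipped report. The `summarize` helper is hypothetical, and it assumes the digest is computed over the UTF-8 bytes of the patched markdown, which the table does not state. Entries in `skipped` are only counted, since their exact shape is not specified here.

```python
import hashlib


def summarize(result: dict) -> list[str]:
    """Check the digest (text mode) and build a short grounded-vs-skipped report."""
    text = result.get("text")
    if text is not None:
        # Assumption: the digest covers the UTF-8 bytes of the patched markdown.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest != result["sha256"]:
            raise ValueError("sha256 does not match the returned text")
    lines = [f"grounded {result['claims_grounded']} of {result['claims_considered']} claims"]
    for item in result.get("grounding", []):
        lines.append(f"- {item['claim']} -> {item['url']}")
    lines.append(f"skipped: {len(result.get('skipped', []))}")
    return lines
```

In file mode `text` is absent, so the digest check is skipped; a caller could instead hash the file at `path` in the clone and compare.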

Vault credentials needed

None. LLM provider key resolved through the AI Providers panel.

Use it from your agent (OpenClaw chat-UI worked example)

Prompt (sent in OpenClaw chat UI / openclaw-cli agent):

Use helmdeck__content-ground in text mode with text="WebAssembly delivers near-native performance and runs in every modern browser. Rust is the most-loved language six years running on Stack Overflow surveys.", model=openrouter/openai/gpt-oss-120b, max_claims=2, topic="web platform". Tell me how many claims were grounded vs skipped, and the URLs that backed each grounded claim.

Tool call (17 calls, no failures):

{
  "name": "helmdeck__content-ground",
  "arguments": {
    "text": "WebAssembly delivers near-native performance and runs in every modern browser. Rust is the most-loved language six years running on Stack Overflow surveys.",
    "model": "openrouter/openai/gpt-oss-120b",
    "max_claims": 2,
    "topic": "web platform"
  }
}