We tested upstream's closed bug and it wasn't fixed
Hook
A v0.29.2 hyperframes pipeline produced 15 seconds of animation followed by 83 seconds of blank canvas. An upstream issue with the same symptom had been closed a month earlier with a fix shipped in 0.6.110. We bumped our pin, rebuilt the sidecar, and re-ran the same reproducer. The bug was still there. Frames at t=20s/100s/200s/300s were byte-identical to the 0.6.97 result — md5 9c95fca0…, 8.6 KB of pure blank canvas. The upstream fix addresses an adjacent code path; ours is duration-mismatch, theirs is attribute-stripping during composition inlining. We filed heygen-com/hyperframes#1540 with the reproducer.
The interesting part isn't the bug. It's the discipline of testing an upstream issue-close before trusting it — and what to do when the trust-but-verify fails.
Context
The pipeline run was run_6f6cb0ea40a94dd1 against builtin.scaffolded-narrated-video: a decision-tree-flavored hyperframes scaffold, narration generated by podcast.generate, audio attached by the new hyperframes.attach_audio pack (v0.29.2 / PR #542), rendered to MP4. The audio was ~98 seconds; the rendered video was the same length. ffprobe said h264 + AAC, both 98s. Playback said 15 seconds of animation, then white.
We spent hours on the first hypothesis: that this was upstream's silent-audio bug (issue #521) regressing. It wasn't: attach_audio was correctly embedding the MP3 in the project tarball and rewriting the root composition's data-duration from 15 to 97.9. But the scaffold's index.html has two compositions:
<div data-composition-id="main" data-duration="15"> <!-- root -->
  <div data-composition-id="decision-tree"
       data-composition-src="compositions/decision_tree.html"
       data-duration="15"></div> <!-- child -->
</div>
We extended the root to 97.9. We left the child at 15. The renderer played 0–15 seconds of decision-tree animation, then the child's slot expired while the root kept going.
Searching upstream's tracker turned up heygen-com/hyperframes#911, titled "Sub-composition slot goes black after GSAP timeline ends, regardless of host data-duration". The exact symptom. It was closed 2026-05-17 with a fix shipped to the runtime. Our pinned 0.6.97 was published 2026-06-13, but the fix was supposed to be in the patch series that followed. The latest release at the time of writing is 0.6.110. We landed the helmdeck-side workaround in PR #546 anyway (it rewrites the child's data-duration to match the root's whenever the two values were equal before the rewrite) and lined up a separate pin bump to bring in the upstream fix too. Defense in depth.
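A minimal sketch of that kind of rewrite, in Python. This is not the code from PR #546: the function name `sync_child_durations` and the regex-based approach are assumptions made for illustration, and the sketch assumes double-quoted attributes and that the first tag carrying data-duration is the root, as in the scaffold above.

```python
import re

# Matches an opening tag that carries a double-quoted data-duration attribute.
# The negated class [^>] also spans newlines, so multi-line tags match too.
_TAG_WITH_DURATION = re.compile(r'<[a-zA-Z][^>]*\bdata-duration="([^"]*)"[^>]*>')


def sync_child_durations(html: str, new_duration: str) -> str:
    """Rewrite the root's data-duration to new_duration, and do the same for
    every other tag whose data-duration equalled the root's old value.

    Hypothetical helper: assumes the first tag with data-duration is the root.
    """
    first = _TAG_WITH_DURATION.search(html)
    if first is None:
        return html  # no composition durations to rewrite
    old_duration = first.group(1)

    def rewrite(match: re.Match) -> str:
        if match.group(1) != old_duration:
            # This duration was set independently of the root; leave it alone.
            return match.group(0)
        return match.group(0).replace(
            f'data-duration="{old_duration}"',
            f'data-duration="{new_duration}"',
        )

    return _TAG_WITH_DURATION.sub(rewrite, html)
```

Only durations that started equal to the root's are touched, so a child that was deliberately given a shorter slot keeps it.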
Finding
Before pushing the pin bump, we ran the same controlled probe on the new image: build the sidecar with hyperframes@0.6.110, copy the same broken-state scaffold (root data-duration=331, child data-duration=15) into the container, render, sample frames at the same timestamps, and md5 them.
| Time | 0.6.97, broken-state scaffold (md5, frame size) | 0.6.110, broken-state scaffold (md5, frame size) |
|---|---|---|
| t=7s, 14s | fc3407… 20 KB (grid) | fc3407… 20 KB (grid) |
| t=20s | 9c95fc… 8.6 KB (blank) | 9c95fc… 8.6 KB (blank) |
| t=100s, 200s, 300s | 9c95fc… 8.6 KB (blank) | 9c95fc… 8.6 KB (blank) |
Byte-identical. The bump didn't fix it.
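The comparison behind that table can be sketched in a few lines of Python. This is an illustration rather than the script we ran: the helper names and the frame_<t>.png file layout are assumptions, and the digests are only comparable if the same ffmpeg build extracts both sets of frames.

```python
import hashlib
from pathlib import Path


def ffmpeg_frame_cmd(video: str, t_seconds: int, out_png: str) -> list[str]:
    """Build an ffmpeg command that writes the single frame at t_seconds."""
    return ["ffmpeg", "-y", "-ss", str(t_seconds), "-i", video,
            "-frames:v", "1", out_png]


def md5_of(path: Path) -> str:
    """Hex md5 digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def compare_frames(dir_a: Path, dir_b: Path,
                   timestamps: list[int]) -> dict[int, bool]:
    """Map each timestamp to True when both runs produced identical bytes.

    Assumes each directory holds frame_<t>.png files (hypothetical layout).
    """
    return {
        t: md5_of(dir_a / f"frame_{t}.png") == md5_of(dir_b / f"frame_{t}.png")
        for t in timestamps
    }
```

Each command from `ffmpeg_frame_cmd` would be passed to `subprocess.run` once per timestamp and per render; `compare_frames` then reduces "do these look the same" to a dictionary of booleans.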
Inspecting the shipped runtime bundle (dist/hyperframe.runtime.iife.js inside /usr/lib/node_modules/hyperframes) confirmed the #911 fix IS shipped:
d.hasAttribute("data-composition-src")||d.hasAttribute("data-composition-file")
The check is there. The fix isn't broken — it just addresses a different code path than ours. Upstream #911's root cause was the producer's htmlCompiler stripping data-composition-src during composition inlining; the runtime fix lets the renderer recognize the inlined-composition marker so the slot stays alive. Our reproducer never goes through the producer (we hand-edit the scaffold's index.html), so data-composition-src is present throughout. The trigger in our case is the duration mismatch itself, not a missing attribute.
We filed heygen-com/hyperframes#1540 with the reproducer above, explicitly framed as "adjacent to but distinct from #911."
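Because the trigger is the duration mismatch itself, it can be caught before rendering. The following is a minimal sketch, assuming numeric data-duration values and that the first composition in the document is the root; `short_children` is a hypothetical helper, not part of hyperframes or helmdeck.

```python
from html.parser import HTMLParser


class _CompositionDurations(HTMLParser):
    """Collect (composition id, duration) for tags carrying both attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.found: list[tuple[str, float]] = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "data-composition-id" in a and "data-duration" in a:
            # Assumes data-duration holds a number, as in the scaffold.
            self.found.append(
                (a["data-composition-id"], float(a["data-duration"]))
            )


def short_children(html: str) -> list[str]:
    """Return ids of compositions shorter than the first (root) composition."""
    parser = _CompositionDurations()
    parser.feed(html)
    if not parser.found:
        return []
    _, root_duration = parser.found[0]
    return [cid for cid, dur in parser.found[1:] if dur < root_duration]
```

Run against the broken-state scaffold, a check like this would flag the decision-tree child; a non-empty result is a cue to look at slot lifetimes before spending time on the audio path.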
Why this matters to you
Two takeaways generalize beyond hyperframes.
The first is the "is this our bug or theirs" method. When you can't tell from logs whether friction comes from your code or the upstream you wrap, the cheapest test is: strip out all your code, reproduce upstream's known-default state, perturb the one variable you suspect, and compare. We used a 331-second MP3 because it is far longer than the scaffold's intrinsic 15-second timeline: long enough that an off-by-one, or a renderer that holds the last frame for a second or two, couldn't be mistaken for the bug. The md5 comparison turns the subjective "the canvas looks white-ish" into the binary "are these two frames the same bytes." A one-line shell loop replaces an hour of squinting. The same method surfaced "0.6.110 doesn't fix this either" in 10 minutes instead of two days of theorizing.
The second is what to do when "trust upstream's close" fails. Conventional wisdom (and our own feedback-upstream-cli-takes-precedence engineering memory) says: don't shim around upstream bugs; report them, wait for the fix, bump the pin. Usually right. But "closed" doesn't always mean "fixed for your exact case." Upstream #911 was closed in good faith; the fix landed; the bundle ships it. It just doesn't catch every code path that produces the same symptom. When verification fails, the right move is the three-part one we landed:
- Keep the workaround. PR #546 was already merged; it stays. The operator-visible bug is closed today, not at some future date.
- Bump the pin anyway — version hygiene, 13 patch releases of unrelated upstream improvements, ADR-037 sentinel discipline intact. Framed honestly in the CHANGELOG as "does NOT fix the slot-lifetime bug; the workaround in PR #546 is what closes it for operators."
- File the upstream issue with a clean reproducer, distinct from the closed one and explicit about what differs. That's the open-source contribution: the next person debugging this same symptom finds a reproducer that runs in five minutes, not a hand-wavy thread.
That's the discipline that survives outside this codebase. "We tested" beats "they closed it." When the test disagrees with the close, file an issue with the test attached.
The thing not to do is shim instead of reporting. The shim alone leaves the next operator hitting the same symptom from a slightly different angle and finding only your workaround, not the underlying bug. The upstream issue closes the loop. Even if upstream doesn't accept it, doesn't fix it, doesn't reply for six months — the public record exists and is the canonical reference.
See also
- The helmdeck workaround (the actual fix for operators today): PR #546
- The upstream issue we filed: heygen-com/hyperframes#1540
- The closed-but-adjacent upstream issue: heygen-com/hyperframes#911
- helmdeck-side watch issue (where the shim comes back out once upstream is fixed): #547
- Pack reference: hyperframes.attach_audio
- ADR 037 (the pin discipline): Upstream package version management
- Earlier hyperframes friction story: Pinning the wrong package
