10. KEDA Autoscaling on Custom Session Metrics

Status: Accepted Date: 2026-04-07 Domain: distributed-systems

Context

Standard CPU/memory HPA does not correlate with browser session demand: a session can be idle (low CPU) while holding a critical authenticated state, or thrashing (high CPU) on a JavaScript-heavy page. Scaling on hardware metrics produces both starvation and waste (PRD §20.6).

Decision

Use KEDA with a Prometheus scaler reading two custom metrics exported by the Go control plane at /metrics:

baas_queued_session_requests — scale up when any request queues.
baas_active_sessions / baas_pool_capacity — scale up when pool exceeds 80% utilization.

A separate browser-pool-warmup Deployment maintains a pool of pre-initialized session containers that the control plane claims on demand, eliminating the 2–4 s Chromium cold start from the first pack call's critical path.

Consequences

Positive: scaling tracks real session demand; warm pool removes cold-start latency; metrics are also visible to operators on the dashboard. Negative: KEDA is a required cluster add-on for the K8s tier; warm pool consumes baseline resources even when idle.

§20.6 Autoscaling with KEDA, §8.2 Dashboard, §18 Success Metrics

Context​

Decision​

Consequences​

Related PRD Sections​

Context

Decision

Consequences

Related PRD Sections