Scale Checkly Agent pods automatically in relation to live load. This page covers the KEDA-based recipe; for static capacity planning, see Scaling and Redundancy.
Prerequisites
- Prometheus V2 metrics are being ingested for your account. The Prometheus V2 exporter is the only source for the gauge this recipe scales on. See Exporting Metrics & Data via Prometheus V2.
- Checkly Agents are deployed via the Checkly agent Helm chart (or an equivalent Deployment). See Kubernetes Deployment.
- KEDA is installed in the cluster.
The signal
Checkly exposes the checkly_private_location_check_runs gauge through the Prometheus V2 exporter. Filtered by state and private_location_slug_name, it gives the number of pending and currently executing check runs in a single Private Location. This is the signal that drives the replica count.
The relevant state values are:
- queued: the check run has been scheduled but not yet picked up by an agent.
- inflight: the check run is currently being executed by an agent.
The gauge is aggregated on a ~1 minute interval, so checks that start and finish within that window may be excluded — their impact on Private Location capacity is negligible.
KEDA ScaledObject
The ScaledObject below provides sensible defaults — adjust the bounds and scaling behavior to match your check workload.
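A minimal sketch of such a ScaledObject, under stated assumptions: the resource name, namespace, Deployment name, Prometheus address, and Private Location slug are all hypothetical placeholders, and the query simply sums the queued and inflight states of the gauge described above. The pollingInterval and cooldownPeriod values shown are KEDA's defaults.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: checkly-agent              # hypothetical name
  namespace: checkly               # hypothetical namespace
spec:
  scaleTargetRef:
    name: checkly-agent            # assumed name of the agent Deployment
  pollingInterval: 30              # seconds between Prometheus queries (KEDA default)
  cooldownPeriod: 300              # seconds to wait before scaling to zero (KEDA default)
  minReplicaCount: 2
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        # Assumed in-cluster Prometheus address; replace with your own.
        serverAddress: http://prometheus.monitoring.svc:9090
        # Queued plus in-flight check runs for one Private Location (slug is a placeholder).
        query: |
          sum(checkly_private_location_check_runs{private_location_slug_name="my-private-location", state=~"queued|inflight"})
        threshold: "1"
```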
The ScaledObject's Prometheus query filters on a single private_location_slug_name, so create one ScaledObject per Private Location.
For a Prometheus instance outside the cluster, add an authenticationRef pointing at a TriggerAuthentication resource with the appropriate credentials.
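As an illustration only, bearer-token authentication could be wired up roughly as follows. The Secret name, key, resource names, and server address are hypothetical; other auth modes supported by the KEDA Prometheus scaler (such as basic or TLS) use different parameters.

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: prometheus-auth              # hypothetical name
  namespace: checkly                 # hypothetical namespace, same as the ScaledObject
spec:
  secretTargetRef:
    - parameter: bearerToken         # scaler parameter filled from the Secret
      name: prometheus-credentials   # hypothetical Secret holding the token
      key: token                     # hypothetical key inside that Secret
---
# Fragment of the ScaledObject spec: only the trigger is shown.
triggers:
  - type: prometheus
    metadata:
      serverAddress: https://prometheus.example.com   # placeholder external address
      authModes: "bearer"
      # query and threshold stay as in your existing trigger
    authenticationRef:
      name: prometheus-auth
```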
How many pods you’ll get
KEDA queries Prometheus on its polling interval and turns the result into a target pod count. With threshold: "1", that target is roughly the number of queued plus in-flight check runs, that is, one pod per check run. The pod count is then clamped between minReplicaCount and maxReplicaCount.
For example, with threshold: "1", minReplicaCount: 2, maxReplicaCount: 10:
| Queued + in-flight check runs | Resulting pods |
|---|---|
| 0 | 2 (idle floor) |
| 1 | 2 |
| 3 | 3 |
| 7 | 7 |
| 20 | 10 (capped) |
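The rows above follow from a ceiling division clamped to the replica bounds. The shell function below is a simplified sketch of that arithmetic, not KEDA's or the HPA's actual implementation: desired_replicas is a hypothetical helper, and it ignores the HPA's tolerance and stabilization behavior.

```shell
# desired_replicas RUNS THRESHOLD MIN MAX
# Approximates the pod count: ceil(RUNS / THRESHOLD), clamped to [MIN, MAX].
desired_replicas() {
  runs=$1
  threshold=$2
  min=$3
  max=$4

  # Integer ceiling division.
  desired=$(( (runs + threshold - 1) / threshold ))

  # Apply the idle floor and the cap.
  if [ "$desired" -lt "$min" ]; then
    desired=$min
  fi
  if [ "$desired" -gt "$max" ]; then
    desired=$max
  fi

  echo "$desired"
}

desired_replicas 7 1 2 10    # prints 7
desired_replicas 20 1 2 10   # prints 10
```

Raising the threshold argument shows how a higher value packs more check runs into each pod.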
Tuning the bounds
- threshold: set it to match the agent’s JOB_CONCURRENCY. The default JOB_CONCURRENCY is 1, so leave threshold: "1". A higher value packs more checks per pod and can cause scheduling delays for long-running checks.
- minReplicaCount: keep at 2 or higher so a single agent failure doesn’t take the Private Location offline. See Scaling and Redundancy.
- maxReplicaCount: must exceed your expected peak queued + in-flight check runs. If the cap is too low, queued check runs accumulate above it and are dropped after the 6-minute queue TTL.
If you set minReplicaCount: 0 to scale to zero when idle, cooldownPeriod becomes important: it controls how long KEDA waits after the trigger goes inactive before scaling the deployment down to zero.
Graceful termination
In-flight checks on a terminating pod are rerun on another agent after a 300-second timeout. Set terminationGracePeriodSeconds above this on the agent pod spec so an evicted pod has room to drain before SIGKILL. For reference, maximum runtimes by check type:
| Check type | Maximum runtime |
|---|---|
| API, TCP, DNS, ICMP | 30 seconds |
| Browser | 4 minutes |
| Multistep | 4 minutes |
| Playwright Check Suite | 60 minutes |
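On a plain Deployment, the grace period described above could be set as in the fragment below. This is a sketch: the value 330 is an assumption (the 300-second rerun timeout plus a 30-second margin), and the corresponding Helm chart value is not shown because its key is not given here.

```yaml
# Fragment of the agent Deployment: only the relevant field is shown.
spec:
  template:
    spec:
      # Assumed value: 300-second rerun timeout plus a 30-second margin.
      terminationGracePeriodSeconds: 330
```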
Verify
- Confirm KEDA created the HPA and is reading the metric.
- Probe the signal directly.
- Schedule a burst of checks against the Private Location and watch the replica count climb toward maxReplicaCount, then settle back to minReplicaCount once the burst clears.
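The first two checks might look like the sketch below. Every name in it is a hypothetical placeholder (namespace, ScaledObject name, Prometheus URL, Private Location slug), and check_keda and probe_signal are illustrative helpers, not part of any Checkly or KEDA tooling. The HPA name relies on KEDA's default keda-hpa-<scaledobject-name> naming.

```shell
# Hypothetical placeholders; replace with your own values.
NAMESPACE="checkly"
SCALED_OBJECT="checkly-agent"
PROM_URL="http://prometheus.monitoring.svc:9090"
SLUG="my-private-location"

# Same signal the ScaledObject scales on: queued plus in-flight check runs.
QUERY="sum(checkly_private_location_check_runs{private_location_slug_name=\"${SLUG}\",state=~\"queued|inflight\"})"

# Show the ScaledObject and the HPA that KEDA generated for it.
check_keda() {
  kubectl get scaledobject "$SCALED_OBJECT" -n "$NAMESPACE"
  kubectl get hpa "keda-hpa-${SCALED_OBJECT}" -n "$NAMESPACE"
}

# Ask Prometheus for the current value through its HTTP query API.
probe_signal() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=${QUERY}"
}
```

After sourcing the snippet, run check_keda and probe_signal from a machine that can reach both the cluster and Prometheus; the in-cluster address above would need a port-forward or a reachable URL when used from outside the cluster.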