# Autoscale Sawmills Collector with KEDA

> Configure KEDA event-driven autoscaling for the Sawmills Collector in Kubernetes using Prometheus, OTLP external scaler, CPU, or memory metrics.

This guide explains how to configure [KEDA](https://keda.sh/) (Kubernetes Event-driven Autoscaling) to automatically scale your Sawmills Collector based on real-time metrics. KEDA enables dynamic, event-driven scaling, ensuring your collector adapts to changing workloads efficiently.

## Prerequisites

Before you begin, ensure you have:

* The Sawmills Collector deployed and running in the `sawmills` namespace
* KEDA installed in your Kubernetes cluster ([KEDA installation guide](https://keda.sh/docs/latest/deploy/))
* (Optional) Prometheus deployed if you want to use Prometheus-based scaling

## Enabling KEDA Autoscaling for Sawmills

KEDA can be enabled and configured via the Sawmills Collector Helm chart. For remote-operator installs, the configuration is managed under the `managedChartsValues.sawmills-collector.keda` section in your `values.yaml` file. The legacy `managedCharts` key still works as an alias.

### Step 1: Enable KEDA in Helm Values

Add or update the following section in the remote operator's `values.yaml`:

```yaml theme={null}
managedChartsValues:
  sawmills-collector:
    keda:
      enabled: true
      minReplicas: 1
      maxReplicas: 10
      pollingInterval: 30 # How often KEDA checks metrics (seconds)
      cooldownPeriod: 300 # Wait after the last trigger was active before scaling down (seconds)
      scaling:
        prometheus:
          enabled: false # Set true to use Prometheus-based scaling
          metadata:
            serverAddress: http://prometheus:9090
            query: "histogram_quantile(0.95, sum(rate(http_server_duration_bucket[1m])) by (le))"
            threshold: "2000"
            activationThreshold: "1000"
        external:
          enabled: false # Set true to use the OTLP-based external scaler provided by Sawmills
          metadata:
            query: "histogram_quantile(0.95, sum(rate(http_server_duration_bucket[1m])) by (le))"
            targetValue: "2000"
        cpu:
          enabled: true
          targetUtilization: 80
        memory:
          enabled: true
          targetUtilization: 80
```

* **minReplicas**/**maxReplicas**: Lower and upper bounds on the number of collector pods.
* **pollingInterval**: How often, in seconds, KEDA checks each trigger's metrics.
* **cooldownPeriod**: How long, in seconds, KEDA waits after the last trigger was active before scaling down.
* **scaling**: The scaling sources to use. Enable one or more of `prometheus`, `external`, `cpu`, and `memory`.
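
For orientation, these values correspond closely to fields of a KEDA `ScaledObject`. The sketch below shows what such an object could look like for the Step 1 values with only the CPU and memory triggers enabled. It is an assumption about how the chart renders these settings, not the chart's actual template; the object name and the target workload name (`sawmills-collector`) are hypothetical placeholders.

```yaml
# Illustrative only: a ScaledObject roughly equivalent to the Step 1 values.
# metadata.name and scaleTargetRef.name are hypothetical placeholders.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sawmills-collector
  namespace: sawmills
spec:
  scaleTargetRef:
    name: sawmills-collector   # workload to scale (kind defaults to Deployment)
  minReplicaCount: 1           # keda.minReplicas
  maxReplicaCount: 10          # keda.maxReplicas
  pollingInterval: 30          # keda.pollingInterval (seconds)
  cooldownPeriod: 300          # keda.cooldownPeriod (seconds)
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "80"            # scaling.cpu.targetUtilization
    - type: memory
      metricType: Utilization
      metadata:
        value: "80"            # scaling.memory.targetUtilization
```

The object that actually exists in your cluster is authoritative; `kubectl get scaledobject -n sawmills -o yaml` prints it in full.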

### Step 2: Apply the Configuration

After updating your `values.yaml`, upgrade the remote operator release that manages your Sawmills Collector:

```bash theme={null}
helm upgrade --install sawmills-remote-operator \
  oci://public.ecr.aws/s7a5m1b4/sawmills-remote-operator-chart \
  --version <latest-version> \
  --namespace sawmills \
  --set apiKeyExistingSecret=sawmills-secret \
  --set operatorAddress=https://controller.ue1.prod.plat.sm-svc.com \
  --set collectorName="my-collector" \
  -f values.yaml
```

## KEDA Scaling Options

You can configure KEDA to scale based on different metrics sources:

### 1. Prometheus-based Scaling

Set `scaling.prometheus.enabled` to `true` and provide the Prometheus server address and a PromQL query. As in Step 1, the `scaling` block sits under `managedChartsValues.sawmills-collector.keda`. Example:

```yaml theme={null}
scaling:
  prometheus:
    enabled: true
    metadata:
      serverAddress: http://prometheus:9090
      query: "sum(rate(otelcol_receiver_accepted_spans[1m]))"
      threshold: "1000"
      activationThreshold: "500"
```
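
To make `threshold` concrete: KEDA's Prometheus scaler uses the `AverageValue` metric type by default, so the HPA that KEDA manages tries to keep the query result divided by the replica count at or below the threshold. The sketch below shows the trigger these values would plausibly produce, with a worked example in the comments. The rendered form is an assumption about the chart, and the load of 4,500 spans per second is made up for illustration.

```yaml
# Illustrative KEDA trigger; how the chart renders it is an assumption.
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      query: sum(rate(otelcol_receiver_accepted_spans[1m]))
      threshold: "1000"           # target average value per replica
      activationThreshold: "500"  # query must exceed this for the trigger to count as active
# Worked example (hypothetical load):
#   query result = 4500 spans/s, threshold = 1000
#   desired replicas = ceil(4500 / 1000) = 5,
#   then clamped to the minReplicas..maxReplicas range
```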

### 2. External Scaler (OTLP) Scaling

Set `scaling.external.enabled` to `true` to use the KEDA OTLP external scaler. Example:

```yaml theme={null}
scaling:
  external:
    enabled: true
    metadata:
      query: "sum(rate(otelcol_receiver_accepted_spans[1m]))"
      targetValue: "1000"
```
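
In KEDA terms, an external scaler is a gRPC service that KEDA queries for metric values, referenced from a trigger of type `external` through its `scalerAddress`. The sketch below shows what such a trigger could look like. The address `keda-otel-scaler.sawmills.svc:4318` is a hypothetical placeholder, and the sketch assumes that the chart fills it in and passes `query` and `targetValue` through to the scaler unchanged.

```yaml
# Illustrative KEDA trigger; the scaler address is a hypothetical placeholder.
triggers:
  - type: external
    metadata:
      scalerAddress: keda-otel-scaler.sawmills.svc:4318  # assumed, set by the chart
      query: sum(rate(otelcol_receiver_accepted_spans[1m]))
      targetValue: "1000"
```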

### 3. CPU/Memory-based Scaling

Enable `scaling.cpu.enabled` and/or `scaling.memory.enabled` to scale based on resource utilization. `targetUtilization` is a percentage of the CPU or memory that the collector pods request:

```yaml theme={null}
scaling:
  cpu:
    enabled: true
    targetUtilization: 80
  memory:
    enabled: true
    targetUtilization: 80
```
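
Because utilization is measured against container resource requests, the targets only have meaning once requests are set. The fragment below is a generic Kubernetes `resources` stanza with hypothetical values, shown to illustrate the arithmetic; where the Sawmills chart exposes this setting is not covered here.

```yaml
# Illustrative container spec fragment; all values are hypothetical.
resources:
  requests:
    cpu: 500m      # 80% target => HPA aims for about 400m average CPU per pod
    memory: 1Gi    # 80% target => HPA aims for about 819Mi average memory per pod
  limits:
    memory: 2Gi
```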

## Example: Full KEDA Configuration

The following complete example enables the Prometheus and CPU scaling sources and leaves the external and memory sources disabled:

```yaml theme={null}
managedChartsValues:
  sawmills-collector:
    keda:
      enabled: true
      minReplicas: 2
      maxReplicas: 20
      pollingInterval: 30
      cooldownPeriod: 300
      scaling:
        prometheus:
          enabled: true
          metadata:
            serverAddress: http://prometheus:9090
            query: "sum(rate(otelcol_receiver_accepted_spans[1m]))"
            threshold: "2000"
            activationThreshold: "1000"
        external:
          enabled: false
          metadata: {}
        cpu:
          enabled: true
          targetUtilization: 75
        memory:
          enabled: false
```

## KEDA Scaler Component for OpenTelemetry

The Sawmills Helm chart includes an optional KEDA scaler component for advanced OTLP-based scaling. To enable it, set:

```yaml theme={null}
managedChartsValues:
  sawmills-collector:
    kedaScaler:
      enabled: true
      # Additional configuration as needed
```

Refer to the Helm chart documentation for advanced configuration options.

## Verifying KEDA Autoscaling

1. **Check KEDA ScaledObject**
   ```bash theme={null}
   kubectl get scaledobject -n sawmills
   # Should show a ScaledObject for your collector
   ```
2. **Check Collector Pod Scaling**
   ```bash theme={null}
   kubectl get pods -n sawmills -l app.kubernetes.io/instance=<collector-release-name>
   # Observe the number of pods scaling up/down based on load
   ```
3. **Check KEDA Operator Logs**
   ```bash theme={null}
   kubectl logs -n keda -l app=keda-operator
   # Look for scaling decisions and errors
   ```

## Troubleshooting KEDA Autoscaling

* **ScaledObject Not Created**: Confirm that `keda.enabled` is set to `true` and that the Helm upgrade completed successfully.
* **Pods Not Scaling**: Check metric queries, thresholds, and KEDA operator logs for errors.
* **Prometheus/External Metrics Not Detected**: Verify Prometheus/external scaler endpoints are reachable and queries return expected results.
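* **Resource-based Scaling Not Working**: Ensure CPU/memory requests and limits are set on the collector pods. Utilization is calculated relative to requests, so a pod without requests has no utilization value to scale on.
* **Existing HPA Still Owns Scaling**: If you enabled KEDA on an existing collector, check for an old HPA targeting the same collector workload. KEDA creates and manages its own HPA through the ScaledObject, so remove the old HPA to keep the two from conflicting.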

## Best Practices for KEDA with Sawmills

* Start with conservative scaling thresholds and adjust based on observed workloads.
* Combine scaling sources (e.g., CPU and Prometheus) so that scaling does not depend on a single signal.
* Monitor KEDA and collector logs for scaling anomalies.
* Test scaling behavior under simulated load before deploying to production.
* For advanced scenarios, leverage the external scaler for custom OTLP metrics.

## References

* [KEDA Documentation](https://keda.sh/docs/)
* [Sawmills Collector Helm Chart](https://github.com/Sawmills/helm-charts)
* [KEDA Scalers Reference](https://keda.sh/docs/latest/scalers/)

For further assistance with KEDA configuration for Sawmills, contact the Sawmills team or consult the [Collector Customization Guide](./collector-customization).
