# Capacity Planning

> Plan Kubernetes resource requirements for your Sawmills Collector deployment with recommended CPU, memory, and instance types for AWS, Azure, and GCP.

## Overview

When deploying a Sawmills Collector, it is crucial to plan for capacity to ensure optimal performance and reliability. Proper capacity planning helps prevent bottlenecks, ensures consistent data processing, and enables cost-effective scaling as your telemetry volume grows.

The Sawmills Collector is architected for horizontal scaling to handle enterprise-scale data volumes efficiently. Under default configuration with 4 CPU cores and no processors, a single collector instance can process approximately **14 million logs per day**, providing a solid foundation for most production workloads.

## Kubernetes Resource Requirements

We recommend setting a Kubernetes limit of at least **4 vCPUs and 3 GiB of memory** per node, with a Kubernetes request (the guaranteed minimum) of **1 vCPU and 1 GiB of memory**.

### AWS Instance Types

The collector runs best on Graviton-based (arm64) instances, preferably **M7g** or **M6g**.

We also support amd64/x86\_64 instances, such as **M7i, M6i, M6a**, although they are less performant.

### Azure Instance Types

For Azure deployments, we recommend the following instance types based on workload requirements:

* **D4s\_v5**: 4 vCPUs, 16 GB RAM - Ideal for moderate telemetry volumes
* **D8s\_v5**: 8 vCPUs, 32 GB RAM - Suitable for high telemetry volumes

### Google Cloud Platform (GCP) Instance Types

For GCP deployments, we recommend the following machine types:

* **n2-standard-4**: 4 vCPUs, 16 GB RAM - Ideal for moderate telemetry volumes
* **n2-standard-8**: 8 vCPUs, 32 GB RAM - Suitable for high telemetry volumes

### Collector Sizing Table

Use this table as a starting point for sizing your collector fleet. The recommendations do not account for multiple exporters or processors, so expect to need more capacity if your pipeline uses them.

| Telemetry Throughput | Logs / second | Collectors |
| -------------------- | ------------- | ---------- |
| 5 GiB/m              | 130,000       | 1          |
| 10 GiB/m             | 260,000       | 2          |
| 20 GiB/m             | 520,000       | 4          |
| 40 GiB/m             | 1,040,000     | 8          |
| 80 GiB/m             | 2,080,000     | 16         |
| 160 GiB/m            | 4,160,000     | 32         |
| 320 GiB/m            | 8,320,000     | 64         |
| 640 GiB/m            | 16,640,000    | 128        |
| 1,280 GiB/m          | 33,280,000    | 256        |
| 2,560 GiB/m          | 66,560,000    | 512        |
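
Every row in the table works out to 130,000 logs per second per collector, so a rate that falls between rows can be estimated by rounding up. The following is a minimal sketch under that assumption of linear scaling; the function name and constant are illustrative and are not part of any Sawmills tooling.

```python
# Assumption: per-collector rate implied by each row of the sizing table
# (no processors, single exporter). Measure your own pipeline before relying on it.
LOGS_PER_SECOND_PER_COLLECTOR = 130_000


def collectors_for_rate(logs_per_second: int) -> int:
    """Return the baseline collector count for a sustained log rate."""
    if logs_per_second <= 0:
        return 0
    # Integer ceiling division: round up so a partial collector becomes a whole one.
    return -(-logs_per_second // LOGS_PER_SECOND_PER_COLLECTOR)


if __name__ == "__main__":
    print(collectors_for_rate(300_000))  # prints 3
```

Treat the result as a baseline only: it carries the same caveats as the table and does not include any capacity buffer.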

## Fault Tolerance and High Availability

**Over-provisioning is essential** for maintaining service continuity. We recommend maintaining **20-30% additional capacity** beyond your baseline requirements to ensure your collector fleet can handle:

* **Node failures**: When one or more collector instances fail unexpectedly
* **Maintenance windows**: Planned downtime for updates, patches, or infrastructure changes
* **Traffic spikes**: Sudden increases in telemetry volume during peak usage periods
* **Rolling deployments**: Zero-downtime updates that temporarily reduce available capacity

This buffer ensures that your remaining collectors can seamlessly absorb the workload without dropping data or experiencing performance degradation.
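
As a rough illustration of the 20-30% buffer, the sketch below rounds a baseline collector count up to the next whole collector after adding headroom. The function name is hypothetical, and the 25% default is simply an assumed midpoint of the recommended range, not a Sawmills setting.

```python
def with_headroom(baseline_collectors: int, headroom_percent: int = 25) -> int:
    """Return the collector count after adding a capacity buffer.

    headroom_percent is the extra capacity to keep in reserve; the text
    above recommends a value between 20 and 30.
    """
    if baseline_collectors <= 0:
        return 0
    # Integer arithmetic avoids floating-point rounding surprises.
    scaled = baseline_collectors * (100 + headroom_percent)
    return -(-scaled // 100)  # ceiling division


if __name__ == "__main__":
    print(with_headroom(8))      # prints 10
    print(with_headroom(4, 20))  # prints 5
```

For small fleets the rounding dominates: a baseline of one collector becomes two at any positive headroom, which is also the minimum needed to survive a single collector failure.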

If the number of collectors is fixed, you can instead scale each collector's CPU and memory vertically to increase throughput. See the Collector Sizing Table above.
