Setting up fallback endpoints for your sources ensures that no logs are lost when your collector becomes unavailable or unhealthy. This is achieved by configuring a fallback endpoint that will receive your logs when the primary collector is not operational.

How It Works

When you configure a fallback endpoint, the following is set up automatically:
  1. An HAProxy sidecar container is deployed alongside your collector pod
  2. HAProxy continuously monitors the health of your collector, both through periodic health checks and by observing response errors from the collector
  3. If the collector is deemed unhealthy, HAProxy automatically routes all incoming data to the fallback endpoint
  4. Once the collector is healthy again, HAProxy resumes sending data to the primary collector
This automatic failover mechanism ensures continuous log collection without any data loss.

Configuration

Fallback endpoints are configured at the source level, allowing you to specify different fallback endpoints for different sources based on your requirements.

Setting Up a Fallback Endpoint

  1. Navigate to your pipeline in the UI
  2. Choose the source you want to configure
  3. Open the “Advanced Settings” section
  4. Choose the Fallback Endpoint URL for this source
When configuring a custom fallback endpoint (for example, for a Loki source), the endpoint URL must include the protocol and omit the path (e.g. http://loki-gateway.default.svc.cluster.local).

Configuring Custom Values

For some sources, such as DataDog, you can control some of the HAProxy settings, such as timeouts. These advanced settings are set through your Helm chart values. To modify them, update the managedCharts.sawmills-collector.haproxy section in your values.yaml file and then upgrade the remote operator.
managedCharts:
  sawmills-collector:
    haproxy:
      fallback_config:
        server:
          fall: 4
          interval: null
          rise: null
        timeout:
          client: null
          connect: 5000
          server: null
  • Connect Timeout (timeout.connect)
    • Description: The maximum time to wait for a connection to the collector or the fallback endpoint to be established (in milliseconds, default: 5000).
    • Type: number
  • Server Timeout (timeout.server)
    • Description: The maximum inactivity time on the server side for the collector and the fallback endpoint (in milliseconds, default: 5000).
    • Type: number
  • Client Timeout (timeout.client)
    • Description: The maximum inactivity time on the client side for the collector and the fallback endpoint (in milliseconds, default: 5000).
    • Type: number
  • Server Interval (server.interval)
    • Description: The interval between health checks of the collector (in milliseconds, default: 2000).
    • Type: number
  • Server Rise (server.rise)
    • Description: The number of consecutive successful health checks required before the collector is marked up (default: 10).
    • Type: number
  • Server Fall (server.fall)
    • Description: The number of consecutive failed health checks required before the collector is marked down (default: 1).
    • Type: number
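
As an illustration, the fragment below sketches a values.yaml override that makes failover less sensitive to a single failed health check. The keys are the ones shown in the example above; the numbers are illustrative assumptions, not recommended settings. Assuming standard HAProxy health-check semantics, and considering health checks alone, an interval of 2000 ms with fall: 3 would mark the collector down after roughly 6 seconds of failed checks, and rise: 5 would mark it up again after roughly 10 seconds of successful ones.

```yaml
# Illustrative override only; the values are assumptions, not recommendations.
managedCharts:
  sawmills-collector:
    haproxy:
      fallback_config:
        server:
          interval: 2000   # ms between health checks (documented default)
          fall: 3          # failed checks before the collector is marked down
          rise: 5          # successful checks before it is marked up again
        timeout:
          connect: 5000    # ms to wait for a connection (documented default)
          server: 10000    # ms of server-side inactivity allowed
          client: 10000    # ms of client-side inactivity allowed
```

Keys that are omitted or set to null presumably fall back to the defaults listed above. That behavior is inferred from the example rather than stated, so it is worth confirming against your chart version before relying on it.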

Benefits

  • Zero Data Loss: Ensures all logs are captured even when the primary collector is down
  • Automatic Failover: Seamless switching between primary and fallback endpoints
  • Per-Source Configuration: Flexibility to configure different fallback endpoints for different sources
Fallback endpoints are currently available only for DataDog and Grafana Loki sources. If you would like support for a different source, please contact Sawmills support.