## Scrape Endpoint
Scrape every collector pod at:

- Port: `19465`
- Path: `/metrics`
- Recommended interval: `15s`
## Collector Configuration
The Prometheus port is set by the Helm chart, so no collector-config edits are needed. To override the default, set the value under `managedChartsValues.sawmills-collector` in your remote-operator `values.yaml` (see Updating Collector Values):
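As a rough sketch, an override in the remote-operator `values.yaml` might look like the following. The keys beneath `sawmills-collector` are assumptions for illustration; check the chart's own values schema for the real structure:

```yaml
# Hypothetical override sketch -- verify the nested key names against
# the sawmills-collector chart's values schema before using.
managedChartsValues:
  sawmills-collector:
    ports:
      metrics:
        port: 19465   # default Prometheus port per this guide
```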
## Metrics to Monitor
The collector emits per-signal counters with the suffix `_log_records_total`, `_metric_points_total`, or `_spans_total`. Where this guide shows `<sig>`, substitute the signal you ingest. All counters carry the `_total` suffix in OpenTelemetry's Prometheus exposition.
- **Liveness**
  - `up{job="<job>"}` — scrape target reachable (the `job` label depends on your scraper config)
  - `otelcol_process_uptime_total` — monotonic uptime; resets on restart
- **Resource Usage**
  - `otelcol_process_cpu_seconds_total` — CPU rate per pod
  - `otelcol_process_memory_rss` — resident memory per pod
  - `otelcol_process_runtime_heap_alloc_bytes` — Go heap (use to detect leaks)
- **Ingestion (Receiver Side)**
  - `otelcol_receiver_accepted_<sig>_total` — successful ingest, broken down by `receiver`
  - `otelcol_receiver_refused_<sig>_total` — input-side rejections; common causes: malformed data, incompatible protocol versions, rate limiting
- **Egress (Exporter Side)**
  - `otelcol_exporter_sent_<sig>_total` — successful sends, broken down by `exporter`
  - `otelcol_exporter_send_failed_<sig>_total` — downstream send errors; common causes: network issues, authentication failures, backend overload
  - `otelcol_exporter_enqueue_failed_<sig>_total` — drops at the queue boundary (data lost before it could be sent)
- **Backpressure**
  - `otelcol_exporter_queue_size` / `otelcol_exporter_queue_capacity` — pair these for a utilization ratio. Sustained high utilization indicates the collector cannot keep up with the destination.
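The queue-utilization ratio described above can be expressed directly in PromQL; this is a minimal sketch, assuming both series carry matching labels (which they do, since they come from the same exporter instance):

```promql
# Queue utilization per exporter; sustained values near 1 mean the
# destination cannot drain the queue fast enough.
otelcol_exporter_queue_size / otelcol_exporter_queue_capacity
```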
## Label Conventions
Sawmills emits processor instances with the format `<type>/<uuid>`. Real receiver and exporter label values look like:

- `receiver="otlp/collector-backend/otlp/72cfa7f8-00ae-40b4-a4eb-dc6f59932c29"`
- `exporter="datadog/ec03b655-369f-4273-b255-8ed7dd33366f"`
- `exporter="awss3/sampling-destination-1876f24b-3bdb-4fe5-93e1-fa36cb89e864"`
Match these labels with a regex (e.g. `exporter=~"datadog/.*"`) rather than literal equality. The `otelcol_exporter_queue_size` and `_queue_capacity` series also carry a `data_type` label (`logs`, `metrics`, `traces`); if you export multiple signals, group or filter on it.
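Putting both conventions together, a query might look like this sketch (the `datadog` exporter type is only an example from the labels above):

```promql
# Regex match survives UUID churn across deployments; data_type
# narrows the queue series to a single signal.
otelcol_exporter_queue_size{exporter=~"datadog/.*", data_type="logs"}
  / otelcol_exporter_queue_capacity{exporter=~"datadog/.*", data_type="logs"}
```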
## A Note on Lazy Emission
OpenTelemetry only exposes a counter after its first non-zero observation. `otelcol_exporter_send_failed_<sig>_total` and `otelcol_exporter_enqueue_failed_<sig>_total` will be entirely absent from the scrape until at least one failure has occurred; this is normal, not a misconfiguration. Alerts on these metrics evaluate as "no data" rather than zero, so use `absent_over_time` or `or vector(0)` if you need to distinguish "healthy" from "metric missing because it never failed".
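The `or vector(0)` pattern can be sketched as follows, using the logs counter as the example signal:

```promql
# Returns 0 instead of an empty result when the failure counter has
# never been emitted, so alert rules see a value rather than "no data".
sum(rate(otelcol_exporter_send_failed_log_records_total[5m])) or vector(0)
```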
## Sample Alert Rules
The following are starting points and should be adjusted for your collector configuration, data volumes, and tolerance for refusals. Replace `<job>` with the actual job label your scraper assigns to the collector: open `/targets` in your Prometheus UI and copy the value shown on the collector rows.
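As one possible starting point, here is a hedged sketch of alerting rules in Prometheus rule-file format. The group name, alert names, thresholds, and `for:` durations are all assumptions to tune, and `<job>` is the placeholder described above:

```yaml
# Illustrative sketch, not the definitive rule set -- adjust thresholds
# and replace <job> with your actual scrape job label.
groups:
  - name: sawmills-collector
    rules:
      - alert: CollectorDown
        expr: up{job="<job>"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Collector scrape target is unreachable"
      - alert: CollectorExportFailures
        # Parentheses matter: "or" binds looser than ">" in PromQL.
        expr: (sum(rate(otelcol_exporter_send_failed_log_records_total[5m])) or vector(0)) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Exporter is failing to send logs downstream"
      - alert: CollectorQueueNearCapacity
        expr: otelcol_exporter_queue_size / otelcol_exporter_queue_capacity > 0.8
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Exporter queue utilization above 80%"
```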
## Appendix: Kube-Prometheus-Stack Example
If you run kube-prometheus-stack with Prometheus Operator, this ServiceMonitor wires up scraping directly. Adjust the `release:` label to match your Prometheus's `serviceMonitorSelector`.
### ServiceMonitor
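A minimal sketch of such a ServiceMonitor is shown below. The namespace, `release:` value, Service selector labels, and port name are assumptions; match them to your actual collector Service and Prometheus installation:

```yaml
# Sketch only -- the selector labels and port name are assumptions;
# inspect your collector Service to find the real values.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: sawmills-collector-monitor
  labels:
    release: kube-prometheus-stack   # must match serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: sawmills-collector   # assumed Service label
  endpoints:
    - port: metrics       # assumed name of the port exposing 19465
      path: /metrics
      interval: 15s
```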
### Verification
Port-forward Prometheus and check `/targets` for the `sawmills-collector-monitor` scrape pool:
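For example (the namespace and Prometheus Service name below are assumptions for a default kube-prometheus-stack install; substitute your own):

```shell
# Assumed namespace and Service name -- adjust to your installation.
kubectl -n monitoring port-forward svc/kube-prometheus-stack-prometheus 9090:9090
# Then open http://localhost:9090/targets and look for the
# sawmills-collector-monitor scrape pool.
```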