Configuring the Sawmills Lookup Processor for Logs
Supported Data Types: Logs
The Sawmills Lookup Processor enriches log records with data from a CSV file by matching a log attribute value against a CSV column and adding all other CSV columns to the log record. This is useful for adding contextual information, such as service metadata or environment details, to your logs.
Configuration Components
1. Name
- Description: Identifier for your processor. Use a unique and descriptive name to differentiate between multiple processors.
2. Attribute Filters
- Conditions: Specify conditions to filter events before processing. Events must satisfy all conditions (AND) or at least one (OR) based on the selected logic.
Each condition follows this sequence:
- Choose the condition type:
  - Log Level (Severity)
  - Body as String
- Select a comparison operator:
  - Equals / Not Equals
- Provide a value:
  - Log Level: Select from a dropdown (INFO, WARN, ERROR, etc.).
  - Body as String: Enter a free-text value for matching.
3. Source Type
- Shared Path: Absolute path to a CSV file accessible by the collector. The file must be readable by the collector process.
  - Example: /etc/sawmills/lookup.csv
  - Requirements: Must be an absolute path and end with the .csv extension.
4. Lookup Key
- Log Attribute: Select the log attribute whose value will be used to look up matching rows in the CSV file. Supports nested attribute paths using dot notation (e.g., sawmills.service).
- CSV Column: The CSV header name used as the lookup key. Must match the file header exactly. All CSV columns except this lookup column will be added to the log record.
5. Enriched Fields
- Target Scope: Where the enriched fields will be written.
- Resources: Adds fields to resource attributes
- Attributes: Adds fields to log record attributes
- Body: Adds fields to the log body. Fields are only written if the body is already a structured object. Plain-text bodies are left unchanged.
- Prefix: Optional namespace added in front of all enriched fields. For example, a prefix of enrichment will write fields as enrichment.<column_name>. Trailing dots are automatically removed. Slashes (/) and backslashes (\, except for escaping dots) are not accepted.
6. Field Conflicts
- On Conflict: Controls what happens if a target field already exists on the record.
- Skip existing fields: Leaves existing fields unchanged and skips enrichment for those fields.
- Override existing fields: Replaces existing field values with values from the CSV file.
Dot Notation for Nested Attributes
The Lookup Processor supports dot notation for accessing nested attributes in both lookup keys and enrichment prefixes. This allows you to work with structured data where attributes are organized in nested maps.
Lookup Key with Dot Notation
When specifying a lookup key with dot notation, the processor traverses nested maps to find the value. For example, if you specify sawmills.service as the lookup key:
- The processor looks for attributes["sawmills"] (which must be a map).
- Then it looks for attributes["sawmills"]["service"] to get the value.
- This value is used to match against the CSV lookup column.
Given a log where sawmills.service = "foo", the processor will:
- Access attributes["sawmills"] (a map).
- Read attributes["sawmills"]["service"], which equals "foo".
- Use "foo" to look up matching rows in the CSV file.
Enrichment Prefix with Dot Notation
When specifying an enrichment prefix with dot notation, the processor creates nested maps to organize the enriched fields. For example, if you specify enrichment.bar as the prefix:
- The processor creates attributes["enrichment"] (a map) if it doesn't exist.
- Then it creates attributes["enrichment"]["bar"] (a map) if it doesn't exist.
- All enriched key-value pairs from the CSV are inserted under attributes["enrichment"]["bar"].
With the prefix enrichment.bar and CSV columns region and environment, the enriched data will be structured as:
- attributes["enrichment"]["bar"]["region"] = CSV value for region
- attributes["enrichment"]["bar"]["environment"] = CSV value for environment
Escaping Dots
If you need to reference a flat attribute key that contains a literal dot (not a nested path), escape it with a backslash: sawmills\.service will look for attributes["sawmills.service"] as a single key rather than traversing nested maps.
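The path-splitting and traversal rules above can be sketched in a few lines of Python. This is a minimal model, assuming dict-based attributes; it is not the actual Sawmills implementation:

```python
# Sketch of dot-notation resolution: split the key path on unescaped
# dots, then traverse nested maps. "\." escapes a literal dot in a key.
import re

def split_path(path):
    # Split on dots NOT preceded by a backslash, then unescape "\." to
    # a literal "." inside each segment.
    parts = re.split(r'(?<!\\)\.', path)
    return [p.replace('\\.', '.') for p in parts]

def resolve(attributes, path):
    """Return the value at the dotted path, or None if any step is
    missing or is not a map (mirroring the pass-through behavior)."""
    current = attributes
    for segment in split_path(path):
        if not isinstance(current, dict) or segment not in current:
            return None
        current = current[segment]
    return current

attrs = {"sawmills": {"service": "foo"}, "sawmills.service": "flat"}
print(resolve(attrs, "sawmills.service"))    # nested traversal -> foo
print(resolve(attrs, r"sawmills\.service"))  # escaped dot, flat key -> flat
```

The escaped form matches a single flat key, while the unescaped form walks one map level per segment.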
Processor Operations
The Lookup Processor operates in the following sequence for each log record:
- Condition Evaluation: If conditions are configured, the processor evaluates them. Logs that don't match the conditions are skipped.
- Lookup Key Extraction: The processor extracts the lookup key value from the specified log attribute (or resource attribute/body) using the configured lookup key path.
- CSV Lookup: The extracted lookup key value is used to search the CSV file’s lookup column for a matching row.
- Enrichment: If a match is found, all CSV columns (except the lookup column) are added to the log record at the specified target scope with the optional prefix.
- Conflict Handling: If target fields already exist, the processor applies the configured conflict resolution strategy (skip or overwrite).
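The five steps above can be sketched as a single function. This is a simplified model of the documented behavior, not the processor's actual code; for brevity it writes prefixed fields as flat dotted keys rather than building nested maps:

```python
# Simplified model of the documented per-record sequence:
# conditions -> key extraction -> CSV lookup -> enrichment -> conflicts.
# lookup_table maps a lookup-key value to {column: value} for its row.
def enrich(record, lookup_table, lookup_attr, prefix=None,
           conditions=(), on_conflict="skip"):
    # 1. Condition evaluation: skip records that fail a condition.
    if not all(cond(record) for cond in conditions):
        return record
    # 2. Lookup key extraction from the record's attributes.
    key = record.get("attributes", {}).get(lookup_attr)
    if key is None:
        return record            # key missing: pass through unchanged
    # 3. CSV lookup: first matching row wins.
    row = lookup_table.get(key)
    if row is None:
        return record            # no matching row: pass through
    # 4 + 5. Enrichment with optional prefix and conflict handling.
    attrs = record["attributes"]
    for column, value in row.items():
        if value == "":
            continue             # empty CSV cells are skipped
        target = f"{prefix}.{column}" if prefix else column
        if target in attrs and on_conflict == "skip":
            continue             # leave existing fields unchanged
        attrs[target] = value
    return record

table = {"checkout": {"region": "us-east-1", "owner": "team-a"}}
log = {"attributes": {"service": "checkout"}}
enrich(log, table, "service", prefix="enrichment")
print(log["attributes"]["enrichment.region"])  # us-east-1
```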
CSV Structure Requirements and Limitations
Requirements
- Header Row: The CSV file must have a header row as the first line containing column names.
- Lookup Key Column: One column must match the configured CSV column name exactly (case-sensitive).
- Column Count Consistency: All data rows must have the same number of columns as the header row.
- File Format: The file must be a valid CSV file with proper formatting.
Limitations
- Maximum Columns: 50 columns per CSV file
- Maximum Rows: 5,000 rows per CSV file
- Duplicate Keys: If multiple rows have the same lookup key value, the first matching row is used (subsequent duplicates are ignored).
- Empty Values: Empty CSV cell values are allowed but are skipped during enrichment (not added to log records).
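The requirements and limits above amount to a small set of checks that a file must pass before it can be used. As an illustration (not the processor's own validation code), they can be expressed like this:

```python
# Illustrative validator for the documented CSV constraints: header row
# present, lookup column matches exactly (case-sensitive), consistent
# column counts, and the 50-column / 5,000-row limits listed above.
import csv
import io

MAX_COLUMNS = 50
MAX_ROWS = 5000

def validate_lookup_csv(text, lookup_column):
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        raise ValueError("CSV must have a header row as the first line")
    header, data = rows[0], rows[1:]
    if lookup_column not in header:          # case-sensitive match
        raise ValueError(f"missing lookup column {lookup_column!r}")
    if len(header) > MAX_COLUMNS:
        raise ValueError(f"more than {MAX_COLUMNS} columns")
    if len(data) > MAX_ROWS:
        raise ValueError(f"more than {MAX_ROWS} data rows")
    for line_no, row in enumerate(data, start=2):
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} columns, got {len(row)}")
    return header, data

header, data = validate_lookup_csv(
    "service_name,region\ncheckout,us-east-1\n", "service_name")
print(header)  # ['service_name', 'region']
```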
CSV File Example
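An illustrative lookup file with this shape (all values are invented for the example):

```csv
service_name,region,environment,owner
checkout,us-east-1,production,team-payments
auth,eu-west-1,production,team-identity
search,us-west-2,staging,team-discovery
```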
- service_name is the lookup key column.
- region, environment, and owner will be added as enrichment attributes.
- The file has 4 columns and 3 data rows (within limits).
Edge Cases and Processor Behavior
Lookup Key Not Found in Log
Scenario: The log record doesn't contain the specified lookup key attribute.
Behavior: The processor skips enrichment for that log record and continues processing. No error is raised, and the log passes through unchanged.
CSV Lookup Key Not Found
Scenario: The lookup key value from the log doesn't match any row in the CSV file.
Behavior: The processor skips enrichment for that log record and continues processing. No error is raised, and the log passes through unchanged.
Empty CSV Values
Scenario: A CSV cell contains an empty value.
Behavior: Empty values are skipped during enrichment. The attribute is not added to the log record.
Duplicate Lookup Keys in CSV
Scenario: Multiple rows in the CSV file have the same lookup key value.
Behavior: The first matching row is used. Subsequent rows with the same key are ignored.
Body Target with Non-Map Body
Scenario: The enrichment target is set to "Body" but the log body is plain text (not a structured object).
Behavior: The processor skips enrichment for that log record. Enrichment only works when the body is already a structured map/object.
Prefix Path Conflicts
Scenario: The enrichment prefix path conflicts with an existing non-map value (e.g., enrichment exists as a string but the prefix is enrichment.bar).
Behavior: Depends on the conflict resolution setting:
- Skip: Processor skips enrichment and logs a debug message
- Overwrite: Processor creates a new map at the conflicting path, replacing the existing value
Condition Evaluation Errors
Scenario: An error occurs while evaluating conditions for a log record.
Behavior: The processor logs a debug message and skips that log record (does not enrich it).
Target Attribute Access Errors
Scenario: The processor cannot access the specified target (attributes, resource, or body).
Behavior: The processor skips enrichment for that log record, logs a debug message, and continues processing other logs.
Shared Path Mount
The Lookup Processor requires access to CSV files via a shared path that is accessible by all collector pod instances. The shared path must be an absolute path on the filesystem where the collector is running.
Path Requirements
- Absolute Path: Must start with / (e.g., /mnt/efs/lookup.csv).
- File Extension: Must end with the .csv extension.
- Readable: The collector process must have read permissions for the file.
- Persistent: The file should be available across collector pod restarts.
- Persistent: The file should be available across collector pod restarts.
EFS Volume Setup for Sawmills Collector
This guide explains how to configure EFS (Elastic File System) volume mounts for the Sawmills Collector using the remote-operator Helm chart.
Overview
The Lookup Processor requires access to lookup files stored on an EFS volume. This setup mounts the EFS filesystem to /mnt/efs in the collector pods.
Prerequisites
Before installing the Helm chart with EFS support, you must create the StorageClass and PersistentVolumeClaim in your cluster.
1. Create the StorageClass
Note: Replace fileSystemId with your actual EFS filesystem ID.
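A minimal example using the AWS EFS CSI driver's access-point provisioning mode; the class name efs-sc and the filesystem ID are placeholders to adapt to your environment:

```yaml
# efs-storageclass.yaml -- example only; adjust names and IDs
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap            # provision an EFS access point per volume
  fileSystemId: fs-0123456789abcdef0  # replace with your EFS filesystem ID
  directoryPerms: "0777"              # see the troubleshooting section
```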
2. Create the PersistentVolumeClaim
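A matching PersistentVolumeClaim might look like the following; the claim name is a placeholder, and EFS itself is elastic, but the Kubernetes API still requires a storage request:

```yaml
# efs-pvc.yaml -- example only; name and size are placeholders
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-lookup-pvc
spec:
  accessModes:
    - ReadWriteMany            # EFS supports concurrent access from all pods
  storageClassName: efs-sc     # must match the StorageClass name
  resources:
    requests:
      storage: 5Gi             # required field; EFS ignores the size
```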
3. Verify Resources
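Both resources can be checked with kubectl. The names here assume the example names efs-sc and efs-lookup-pvc; substitute your own:

```shell
# The StorageClass should be listed, and the PVC should reach "Bound".
kubectl get storageclass efs-sc
kubectl get pvc efs-lookup-pvc
```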
Helm Installation
Option A: Using --set flags
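A hypothetical invocation is shown below. The chart reference, release name, and value keys (extraVolumes/extraVolumeMounts) are assumptions following common Helm chart conventions; check the remote-operator chart's values.yaml for the actual keys:

```shell
# Assumed value keys -- verify against the chart's values.yaml
helm upgrade --install sawmills-collector <remote-operator-chart> \
  --set "collector.extraVolumes[0].name=efs-lookup" \
  --set "collector.extraVolumes[0].persistentVolumeClaim.claimName=efs-lookup-pvc" \
  --set "collector.extraVolumeMounts[0].name=efs-lookup" \
  --set "collector.extraVolumeMounts[0].mountPath=/mnt/efs"
```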
Option B: Using a Values File (Recommended)
Create a values file values-efs.yaml:
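The sketch below uses extraVolumes/extraVolumeMounts keys, which are a common Helm convention and an assumption here; verify the key names against the remote-operator chart's values.yaml:

```yaml
# values-efs.yaml -- assumed value keys; verify against the chart
collector:
  extraVolumes:
    - name: efs-lookup
      persistentVolumeClaim:
        claimName: efs-lookup-pvc   # the PVC created in the prerequisites
  extraVolumeMounts:
    - name: efs-lookup
      mountPath: /mnt/efs           # path referenced by the Lookup Processor
```

Then pass the file at install time with `-f values-efs.yaml`.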
Verification
After deployment, verify the EFS mount is working (for example, by listing /mnt/efs from inside a collector pod and confirming the CSV file is readable).
Lookup Processor Configuration
Once the EFS volume is mounted, configure the csvenrichment processor in your collector config to use the mounted files:
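The exact csvenrichment schema is not reproduced in this page; the field names below are assumptions mapped from the configuration components described earlier, so consult the processor reference for the real keys:

```yaml
# Assumed field names -- check the csvenrichment processor reference
processors:
  csvenrichment:
    shared_path: /mnt/efs/lookup.csv    # absolute path on the mounted volume
    lookup_key:
      log_attribute: sawmills.service   # dot notation traverses nested maps
      csv_column: service_name          # must match the CSV header exactly
    target: attributes                  # resources | attributes | body
    prefix: enrichment                  # optional namespace for enriched fields
    on_conflict: skip                   # skip | override
```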
Troubleshooting
PVC Not Bound
If the PVC remains in a Pending state, check that:
- The StorageClass exists
- The EFS CSI driver is installed
- The fileSystemId is correct
Mount Permission Denied
Ensure the EFS access point has correct permissions:
- directoryPerms: "0777" in the StorageClass
- Security groups allow NFS traffic (port 2049)