Audit logging is a new capability in Apache Pinot. It is available in StarTree Cloud starting with the 0.11.1 release (see the release notes).
Overview
Apache Pinot provides a comprehensive audit logging framework that captures API requests across different components (Controller, Broker). The audit system logs structured JSON events containing request details, user information, and payload data for security monitoring and compliance purposes.
Configuration Properties
All audit configurations follow the pattern: pinot.audit.{component}.{property}
Where component can be:
- controller - For Pinot Controller instances
- broker - For Pinot Broker instances
1. enabled
Type: Boolean
Default: false
Description: Master switch to enable/disable audit logging for the component.
Example:
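A minimal sketch, assuming the property is set in each component's configuration file and that the full key is built from the pinot.audit.{component}.{property} pattern described above:

```properties
# Turn on audit logging for the Controller
pinot.audit.controller.enabled=true
# The Broker has its own, independent switch
pinot.audit.broker.enabled=true
```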
2. capture.request.payload.enabled
Type: Boolean
Default: false
Description: Enables capturing of request body payloads in audit logs. When enabled, POST/PUT request bodies will be logged (subject to size limits).
Example
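A minimal sketch for the Controller, assuming the key follows the pinot.audit.{component}.{property} pattern:

```properties
# enabled is the master switch, so it is set alongside payload capture
pinot.audit.controller.enabled=true
# Log POST/PUT request bodies, subject to the configured size limit
pinot.audit.controller.capture.request.payload.enabled=true
```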
3. capture.request.headers
Type: String (comma-separated list)
Default: "" (empty)
Description: Comma-separated list of HTTP headers to capture in audit logs. Headers are matched case-insensitively.
Example
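A minimal sketch for the Controller. The header names are illustrative assumptions; list whichever headers matter in your environment:

```properties
# Header names are matched case-insensitively, so casing here is cosmetic
pinot.audit.controller.capture.request.headers=Content-Type,User-Agent,X-Request-Id
```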
4. request.payload.size.max.bytes
Type: Integer
Default: 8192 (8KB)
Maximum: 65536 (64KB, hard limit)
Description: Maximum size of request payload to capture in bytes. Payloads exceeding this limit will be truncated with a marker.
Example
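A minimal sketch for the Controller; the value 16384 is an arbitrary choice for illustration:

```properties
# Capture up to 16 KB of each request body (must stay at or below the 65536-byte hard limit)
pinot.audit.controller.request.payload.size.max.bytes=16384
```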
5. url.filter.exclude.patterns
Type: String (comma-separated list)
Default: "" (empty)
Description: Comma-separated list of URL patterns to exclude from audit logging. Useful for excluding health checks, metrics endpoints, etc. Exclusion patterns have priority over inclusion patterns.
Note:
- Since this is a comma-separated list, individual patterns cannot contain commas. Use multiple separate patterns instead of glob alternatives like {api,v1}.
- Patterns should NOT start with '/'. The patterns are matched against the URL path without the leading slash.
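As a rough sketch of the notes above, assuming the Controller component; the endpoint names health and metrics are illustrative and may differ in your deployment:

```properties
# No leading slash, one pattern per comma-separated entry
pinot.audit.controller.url.filter.exclude.patterns=health,metrics/**
```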
6. url.filter.include.patterns
Type: String (comma-separated list)
Default: "" (empty)
Description: Comma-separated list of URL patterns to include in audit logging. When specified, only URLs matching these patterns will be audited (unless excluded). If not specified, all URLs are audited by default (except excluded ones).
Note:
- Since this is a comma-separated list, individual patterns cannot contain commas. Use multiple separate patterns instead of glob alternatives like {api,v1}.
- Patterns should NOT start with '/'. The patterns are matched against the URL path without the leading slash.
- Exclusion always wins: URLs matching exclude patterns are never audited, even if they match include patterns
- Include acts as allowlist: When include patterns are defined, only matching URLs are audited
- Default behavior: Without include patterns, all non-excluded URLs are audited
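The interplay between the two lists can be sketched as follows, assuming the Controller component; the paths are illustrative:

```properties
# Allowlist: audit only table-related URLs
pinot.audit.controller.url.filter.include.patterns=tables/**
# Exclusion wins: URLs under tables/<name>/segments/ are skipped even though they match tables/**
pinot.audit.controller.url.filter.exclude.patterns=tables/*/segments/**
```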
7. userid.header
Type: String
Default: "" (empty)
Description: HTTP header name containing the user identifier. Takes precedence over JWT claim if both are configured.
Example
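A minimal sketch for the Broker. X-User-Id is a hypothetical header name; use whatever header your gateway or proxy actually sets:

```properties
# Read the user identifier from this request header (hypothetical name)
pinot.audit.broker.userid.header=X-User-Id
```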
8. userid.jwt.claim
Type: String
Default: "" (empty)
Description: JWT claim name to extract user identity from Authorization Bearer tokens. Used when userid.header is not found.
Example
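A minimal sketch for the Broker, assuming your tokens carry the user identity in the standard sub (subject) claim; substitute the claim your identity provider uses:

```properties
# Extract the user identity from the "sub" claim of the Bearer token (assumed claim name)
pinot.audit.broker.userid.jwt.claim=sub
```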
9. capture.response.enabled
Type: Boolean
Default: false
Description: Enables capturing of HTTP response information in audit logs. When enabled, response status codes, request IDs, and response duration will be logged along with request details.
Example
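A minimal sketch for the Controller, assuming the key follows the pinot.audit.{component}.{property} pattern:

```properties
# Log response status code, request ID, and duration alongside request details
pinot.audit.controller.capture.response.enabled=true
```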
Audit Event Structure
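Audit events are logged as structured JSON. Each event carries the request details and user identity, plus any captured headers, payload, and response information enabled by the properties above.
URL Pattern Syntax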
Both include and exclude patterns support glob and regex matching.
Glob Patterns (default)
- * - matches any characters within a path segment
- ** - matches any characters across multiple path segments
- ? - matches a single character
- [abc] - matches any character in the set
- [a-z] - matches any character in the range
- [!abc] - matches any character NOT in the set
Examples:
- health - exact match for /health
- tables/* - matches /tables/myTable but not /tables/myTable/segments
- tables/** - matches /tables/myTable and /tables/myTable/segments
- schemas/* - matches /schemas/mySchema
- segments/*/metadata - matches /segments/mySegment/metadata
Regex Patterns
Prefix a pattern with regex: to use regular expressions.
Important Note: Avoid using commas in regex patterns as they will be interpreted as list separators. Use character classes or alternation without commas instead.
Examples:
- regex:tables/[a-zA-Z0-9_]+$ - matches table names made of alphanumeric characters and underscores
- regex:^health(check)?$ - matches only “health” or “healthcheck”
- regex:segments/.*\.tar\.gz$ - matches segment tar.gz files
- regex:tables/(realtime|offline)/.* - matches realtime or offline table operations
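Glob and regex entries can presumably be mixed in one list, since the regex: prefix is applied per pattern. A minimal sketch under that assumption, reusing patterns from the examples above:

```properties
# First entry is a regex (note the prefix), second entry is a plain glob
pinot.audit.controller.url.filter.exclude.patterns=regex:^health(check)?$,segments/*/metadata
```

If the configuration is loaded from a Java properties file, backslashes act as escape characters there, so a pattern such as the tar.gz example may need its backslashes doubled.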
Complete Configuration Examples
Controller with Full Audit Logging
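A sketch of what such a configuration could look like, combining the properties documented above. The header names, the sub claim, the size limit, and the excluded path are illustrative assumptions:

```properties
pinot.audit.controller.enabled=true
# Capture request bodies up to 16 KB
pinot.audit.controller.capture.request.payload.enabled=true
pinot.audit.controller.request.payload.size.max.bytes=16384
# Capture selected headers and response information
pinot.audit.controller.capture.request.headers=Content-Type,User-Agent
pinot.audit.controller.capture.response.enabled=true
# Identify the user from a header first, then from the JWT claim
pinot.audit.controller.userid.header=X-User-Id
pinot.audit.controller.userid.jwt.claim=sub
# Skip health checks
pinot.audit.controller.url.filter.exclude.patterns=health
```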
Broker with Security-Focused Configuration
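A sketch of a configuration that favors minimal data exposure, following the security considerations in this document. The header name, claim name, and excluded path are illustrative assumptions:

```properties
pinot.audit.broker.enabled=true
# Keep request bodies out of the log
pinot.audit.broker.capture.request.payload.enabled=false
# Capture only a non-sensitive header; Authorization is deliberately not listed
pinot.audit.broker.capture.request.headers=User-Agent
# Identify the user from the JWT claim and record response status
pinot.audit.broker.userid.jwt.claim=sub
pinot.audit.broker.capture.response.enabled=true
pinot.audit.broker.url.filter.exclude.patterns=health
```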
Controller with Include Pattern Allowlist
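A sketch of an allowlist-style configuration, using the glob patterns from the URL Pattern Syntax examples; which paths to audit is an assumption to adapt to your needs:

```properties
pinot.audit.controller.enabled=true
# Only table, schema, and segment-metadata URLs are audited; everything else is ignored
pinot.audit.controller.url.filter.include.patterns=tables/**,schemas/*,segments/*/metadata
```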
Audit Metrics
The audit system includes built-in metrics collection to monitor the performance and health of the audit logging infrastructure. These metrics are automatically exposed through Pinot’s metrics system and can be monitored via JMX or other metrics collection systems.
Available Metrics
The audit system tracks the following metrics:
- Audit processing time: Duration of audit event processing
- Audit event count: Number of audit events processed
- Audit errors: Count of errors during audit processing
- Request processing: Time spent processing requests for audit logging
Metrics Integration
Audit metrics are automatically integrated with:
- Controller Metrics: When audit is enabled on Controller instances
- Broker Metrics: When audit is enabled on Broker instances
Dynamic Configuration Updates
The audit configuration supports dynamic updates through Pinot’s cluster configuration mechanism. Changes to audit settings are applied without requiring service restarts.
Log Output Configuration
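Audit events go to the org.apache.pinot.audit logger, so they can be routed to their own file. A rough sketch using Log4j2's properties-format configuration; the appender name, file path, and layout pattern are assumptions, and a deployment that uses an XML Log4j2 configuration would need the equivalent XML elements instead:

```properties
# Dedicated file appender for audit events (name and path are illustrative)
appender.audit.type = File
appender.audit.name = AuditFile
appender.audit.fileName = logs/pinot-audit.log
appender.audit.layout.type = PatternLayout
appender.audit.layout.pattern = %d{ISO8601} %m%n

# Route the audit logger to that appender only
logger.audit.name = org.apache.pinot.audit
logger.audit.level = info
logger.audit.additivity = false
logger.audit.appenderRef.file.ref = AuditFile
```

Setting additivity to false keeps audit events out of the main application log.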
Audit logs are written using SLF4J to the logger named org.apache.pinot.audit. Configure your logging framework (e.g., Log4j2) to route this logger to the desired destination.
Best Practices
Security Considerations
- Sensitive Data: Be cautious when enabling payload capture as it may log sensitive information
- Header Selection: Only capture necessary headers to minimize exposure of authentication tokens
- Payload Size: Keep payload limits reasonable to prevent logging excessive data
- Log Storage: Ensure audit logs are stored securely with appropriate access controls
- Log Rotation: Implement proper log rotation to manage disk space
Performance Considerations
- Selective Enabling: Only enable audit logging for components that require it
- URL Filtering: Exclude high-frequency endpoints like health checks and metrics
- Payload Capture: Disable payload capture for high-throughput services if not required
- Header Filtering: Limit captured headers to only what’s necessary
Monitoring and Alerting
- Set up monitoring for audit log volume and errors
- Alert on authentication failures or unauthorized access patterns
- Regularly review audit logs for suspicious activity
- Integrate with SIEM systems for centralized security monitoring
Troubleshooting
Audit Logs Not Appearing
- Verify enabled is set to true for the component
- Check the logging configuration for org.apache.pinot.audit logger
- Ensure the service has reloaded configuration after changes
Payload Truncation
If payloads are being truncated:
- Check the request.payload.size.max.bytes setting
- Remember the hard limit is 64KB (65536 bytes)
- Look for the …[truncated] marker in the logs
User Identity Not Captured
- Verify userid.header or userid.jwt.claim is correctly configured
- Ensure the specified header is present in requests
- For JWT claims, verify the Authorization header contains a valid Bearer token
Migration Notes
When migrating from older Pinot versions:
- The configuration prefix has changed from a global pinot.audit to per-component prefixes
- Update your configuration to use pinot.audit.controller or pinot.audit.broker
- Each component now maintains its own audit configuration independently