Why is it useful? Who would benefit from it? How should it work?
CURRENT SITUATION:
------------------
In Instana, the Logging Feature operates in
an all-or-nothing manner:
• All agents continuously send logs to the backend, regardless of whether the
feature is enabled at the backend
• When the feature is disabled at the backend, the logs are still transmitted
and only then discarded
• When enabled, all logs from all agents are stored in ClickHouse
• There is no mechanism to selectively control which agents send logs
PROBLEM:
--------
For large-scale deployments (7,500+ agents), enabling the Logging Feature
globally creates significant concerns for both On-Premise and SaaS customers:
FOR ON-PREMISE CUSTOMERS:
• Massive data volume and storage requirements
• Potential backend overload from simultaneous log ingestion
• Unnecessary network bandwidth consumption
• Infrastructure costs for storing logs from non-critical systems
FOR SAAS CUSTOMERS:
• Risk of exceeding Fair Usage Policy limits due to uncontrolled log ingestion
• Increased costs from data ingestion charges
• No ability to manage data volume to stay within contracted limits
FOR ALL CUSTOMERS:
• No ability to pilot the feature with a subset of agents
• Cannot selectively enable logging for critical systems only
• All-or-nothing approach prevents gradual adoption
WHO WOULD BENEFIT:
------------------
• Instana SaaS Customers - Reduce data ingestion to stay within Fair Usage
Policy limits
• On-Premise customers with large Instana deployments - Prevent backend
overload and manage infrastructure costs
• Organizations wanting to implement phased rollouts of new features
• Teams needing to prioritize logging for critical applications (e.g.,
production environments) while excluding development/test systems
• Cost-conscious customers managing storage and bandwidth expenses
PROPOSED SOLUTION:
------------------
Implement a control mechanism for the Logging Feature:
1. AGENT-LEVEL CONTROL:
• Configuration option in agent configuration file to enable/disable
"extended logging"
• When disabled, agent does not collect or transmit logs to backend
• When enabled, agent behavior follows current implementation
• Default: disabled (opt-in model for backward compatibility)
• Support for dynamic configuration updates without agent restart
2. SENSOR-LEVEL CONTROL (IDEAL - Future Enhancement):
• Granular control at individual sensor level (application or platform
level like Kubernetes)
• Enable/disable logging per sensor type or specific application
• Same configuration options as agent-level control
• Allows fine-tuning for specific technologies or workloads
• Example: Enable logging only for critical Java applications while
disabling it for test environments
EXAMPLE AGENT CONFIGURATION:
----------------------------
com.instana.plugin.logging:
  enabled: true # Enable extended logging for this agent
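
To illustrate how the sensor-level control (item 2 of the proposed solution) might extend the agent-level switch, the following is a rough sketch only. The "sensors" key and everything beneath it are hypothetical names invented for this request; they do not exist in the agent today, and the final structure would be up to the Instana team.

```yaml
# Hypothetical sketch: "sensors" and its children are illustrative names only.
com.instana.plugin.logging:
  enabled: false        # Agent-wide default: do not collect or send logs
  sensors:              # Assumed per-sensor overrides of the agent-wide default
    java:
      enabled: true     # Send logs for Java applications on this agent
    kubernetes:
      enabled: false    # Keep platform-level logs switched off
```

With a layout along these lines, the agent-wide value would act as the default and each sensor entry would override it, so a customer could keep logging off in general and opt in only the critical technologies.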
IMPLEMENTATION BENEFITS:
------------------------
• Customers can pilot logging with specific agent groups (e.g., production
agents, test agents, or the agents of specific applications only)
• Gradual rollout reduces risk of backend overload
• Reduced network traffic and storage costs
• Better alignment with customers' operational requirements
• Maintains existing sensor-level controls as an additional fine-tuning option
NOTE: While some sensors currently support logging/tracing controls, this is
poorly documented, inconsistent across sensors, and may inadvertently disable
trace information that customers rely on independently of the Logging Feature.