Status Submitted
Workspace Instana
Categories Tracing
Created by Guest
Created on Apr 2, 2025

Tracing: Ensure 365-Day Retention for Critical or Viewed Traces and Calls

Overview:
Our current Instana environment does not reliably persist traces and calls beyond the default 7-day retention period. According to the documentation, when a trace meets specific conditions, such as being viewed or containing critical errors, it should be persisted and retrievable for up to 365 days. However, we see unpredictable behavior: traces that meet these conditions still expire after 7 days, which hinders long-term analysis and troubleshooting.

Problem Statement:
Traces and calls that are either critical (e.g., containing errors) or have been viewed should be retained for 365 days. 
Currently, even when these conditions are met, the traces and calls are not consistently persisted beyond 7 days. 

This behavior:

- Limits our ability to conduct thorough long-term error analysis.

- Prevents effective post-incident review and troubleshooting.

- Reduces confidence in the observability solution, as the historical context is lost unexpectedly.

Proposed Enhancement:

- Guaranteed 365-Day Retention:
Automatically persist and ensure accessibility of traces and calls for 365 days once they meet one of the following criteria:

- Viewed Traces: Traces or calls that have been viewed in the UI or retrieved through an API call (GET by trace ID).
The size / number of calls should not affect persisted / extended retention of traces here.

- Traces with a generated link: When a link has been created for a trace/call via the Link button, the trace/call must be persisted.
The size / number of calls should not affect persisted / extended retention of traces here.

- Critical Traces: Traces in which at least one call includes critical errors or log messages with severity ERROR or FATAL. (Since an error scenario can produce a large number of such traces, sampling can be applied to those that have not been viewed. It would be fantastic to have sampling based on service / error type, so that a selection of those errors is kept in long-term retention.)
The size / number of calls can be relevant for persisted / extended retention of traces here, to avoid excessive data storage. Alternatively, more aggressive sampling could be used here.
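
To make the three criteria concrete, here is a minimal Python sketch of the decision rule we have in mind. It is only an illustration of the request, not a description of how Instana works internally. Every name in it (`Trace`, `retention_days`, `in_sample`, `sample_rates`, `max_calls`) is hypothetical, and the default sampling rate and size cap are placeholder values.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

DEFAULT_RETENTION_DAYS = 7
EXTENDED_RETENTION_DAYS = 365


@dataclass(frozen=True)
class Trace:
    """Hypothetical trace summary; not an Instana data model."""
    trace_id: str
    service: str
    error_type: Optional[str] = None  # None when the trace has no critical error
    viewed: bool = False              # opened in the UI or fetched by trace ID via API
    linked: bool = False              # a link was created via the Link button
    call_count: int = 1


def in_sample(trace: Trace, rate: float) -> bool:
    """Deterministic sampling: the same trace always gets the same decision."""
    key = f"{trace.service}|{trace.error_type}|{trace.trace_id}".encode()
    bucket = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    # Compare as integers so that rate=1.0 keeps everything and rate=0.0 nothing.
    return bucket < int(rate * 2**64)


def retention_days(
    trace: Trace,
    sample_rates: Dict[Tuple[str, str], float],
    default_rate: float = 0.01,   # assumed placeholder value
    max_calls: int = 10_000,      # assumed placeholder size cap
) -> int:
    # Viewed or linked traces are always kept, regardless of size.
    if trace.viewed or trace.linked:
        return EXTENDED_RETENTION_DAYS
    # Critical traces are kept when sampled; size may limit retention here.
    if trace.error_type is not None and trace.call_count <= max_calls:
        rate = sample_rates.get((trace.service, trace.error_type), default_rate)
        if in_sample(trace, rate):
            return EXTENDED_RETENTION_DAYS
    return DEFAULT_RETENTION_DAYS
```

In this sketch the sampling rate is looked up per (service, error type) pair, so a noisy error in one service can be sampled more aggressively than a rare error in another.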

Side note:

- Make the saved traces visible only in the analyze trace / call view or via the persisted link. If a user is looking for statistics, as in APM metrics, those traces should not be the foundation of the metrics once they have passed the 7-day retention time. After the 7-day trace retention, metrics are the source of truth for APM views.
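
The visibility rule in this side note could be expressed roughly as follows. This is a hedged sketch under our own assumptions: the view names (`"analyze"`, `"link"`, `"apm_metrics"`) and the function `visible_in` are hypothetical and do not correspond to any Instana API.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION = timedelta(days=7)
EXTENDED_RETENTION = timedelta(days=365)

# Hypothetical view names: only these may show traces older than 7 days.
LONG_TERM_VIEWS = {"analyze", "link"}


def visible_in(view: str, trace_start: datetime, persisted: bool, now: datetime) -> bool:
    """Return True if a trace may be shown in the given view at time `now`."""
    age = now - trace_start
    if age <= DEFAULT_RETENTION:
        return True  # inside the default window every view may use the trace
    if persisted and age <= EXTENDED_RETENTION:
        # Persisted traces stay reachable, but never feed APM metrics views.
        return view in LONG_TERM_VIEWS
    return False


now = datetime(2025, 4, 30, tzinfo=timezone.utc)
old_trace = now - timedelta(days=20)
print(visible_in("analyze", old_trace, persisted=True, now=now))      # True
print(visible_in("apm_metrics", old_trace, persisted=True, now=now))  # False
```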

Configuration and Documentation:
Update the Instana documentation to accurately reflect when and how traces are persisted for 365 days.

Business Impact:
Ensuring that critical or frequently viewed traces and calls are retained for 365 days is vital for:

- Long-term error analysis and efficient troubleshooting.

- Maintaining a comprehensive historical record for performance and reliability investigations.

- Enabling teams to review past incidents, identify recurring issues, and improve overall service quality.

Use Case Example:
A) During a critical incident, a development team generates a short link for a trace that meets the criteria for extended retention. The expectation is that this trace remains accessible for 365 days, allowing the team to perform in-depth post-incident analysis. With the current 7-day retention, the loss of this trace compromises the ability to diagnose recurring issues and delays resolution efforts.

B) During the technical review at the end of a 2-week sprint, the team finds suspicious metrics (errors, high latency, error log messages) that did not cause an incident. The development team wants to understand the issue in more detail and looks for traces that meet the criteria. The expectation is that some traces for the specific error remain visible beyond 7 days, covering the 2-week sprint, so that the team can perform in-depth analysis to understand the nature of the problem. With the current 7-day retention, the loss of these traces compromises the ability to diagnose infrequent issues and delays resolution efforts.

Conclusion:
We request that Instana implement a solution that automatically persists traces and calls that are either critical or have been viewed, and that guarantees their retention for 365 days. This enhancement will improve long-term observability, support effective troubleshooting, and enhance overall operational reliability.

Thank you for considering this enhancement request. I look forward to your feedback.

Idea priority High