Following the guide at https://www.ibm.com/docs/en/instana-observability/current?topic=applications-monitoring-gpu-public-preview, we were able to bring some data into Instana by using OpenTelemetry with Prometheus-format collection and processing of the telemetry exported from Nvidia DCGX. However, certain data is not being labeled or collected by the agent. This OpenTelemetry customization is also not viable on an Nvidia appliance architecture, because multiple endpoints would need to provide telemetry on the same port to a single agent installed on the cluster manager. Nvidia uses a BCM (Bright Cluster Management) + UFM (Unified Fabric Manager) architecture to manage the host images that have access to the actual GPUs on the appliance. Installing Instana on the BCM does not give native visibility of the hosts with GPUs. Customizing and collecting telemetry through OpenTelemetry is therefore not feasible for environments with hundreds of active nodes under multiple appliances, all targeting the same port.
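To illustrate the scale of that customization, here is a minimal Python sketch that enumerates the per-node scrape targets a Prometheus-style collector configuration would have to carry. It is not taken from the guide: the host naming scheme, the appliance names and node counts, and the `build_scrape_targets` and `build_scrape_config` helpers are all hypothetical. Port 9400 is assumed because it is the default for NVIDIA's dcgm-exporter; a real deployment may use something else.

```python
from typing import Dict, List

# Assumed exporter port (dcgm-exporter default); adjust for the actual environment.
EXPORTER_PORT = 9400


def build_scrape_targets(appliances: Dict[str, int], port: int = EXPORTER_PORT) -> List[str]:
    """Return one host:port target per GPU node across all appliances.

    `appliances` maps a hypothetical appliance name to its node count.
    """
    targets = []
    for appliance, node_count in sorted(appliances.items()):
        for index in range(1, node_count + 1):
            # Hypothetical naming scheme: <appliance>-node<NNN>
            targets.append(f"{appliance}-node{index:03d}:{port}")
    return targets


def build_scrape_config(targets: List[str], job_name: str = "gpu-nodes") -> dict:
    """Wrap the targets in a Prometheus-style scrape_configs structure."""
    return {
        "scrape_configs": [
            {"job_name": job_name, "static_configs": [{"targets": targets}]}
        ]
    }


if __name__ == "__main__":
    # Two hypothetical appliances with 128 GPU nodes each.
    targets = build_scrape_targets({"pod-a": 128, "pod-b": 128})
    config = build_scrape_config(targets)
    print(len(targets), "targets, all on port", EXPORTER_PORT)
```

Every node has to appear as its own entry on the same port, and the list has to be regenerated whenever host images are added or removed from the cluster manager.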
We need native support and instrumentation from the Instana agent.
Idea priority: High
If the link to the Public Preview documentation does not work, the original guide on the IBM Community can be accessed at https://community.ibm.com/community/user/instana/blogs/yanwei-li/2024/06/14/gpu-observability-with-instana
We need this to move forward with the opportunity. Your support would be greatly appreciated!