This is an IBM Automation portal for Cloud Management, Technology Cost Management, Network Automation and AIOps products. To view all of your ideas submitted to IBM, create and manage groups of Ideas, or create an idea explicitly set to be either visible by all (public) or visible only to you and IBM (private), use the IBM Unified Ideas Portal (https://ideas.ibm.com).
The MVP for a MinIO monitor should include the following:
1. System Health Monitoring
The system must collect CPU, memory, network usage, and storage I/O from each MinIO node.
The system must track process-level health (running state, crashes, unexpected exits).
The system must detect pod/server restarts and abnormal termination codes.
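As an illustration of the restart requirement, here is a minimal Python sketch. It assumes the monitor already samples each node's process start time and the exit code of the previous run. The `ProcessSample` record and the `detect_restarts` helper are hypothetical names, not part of MinIO or any IBM product.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ProcessSample:
    """Hypothetical sample taken by the monitor for one MinIO node or pod."""
    node: str                      # node or pod name
    start_time: float              # process start time, epoch seconds
    last_exit_code: Optional[int]  # exit code of the previous run, if known


def detect_restarts(samples: List[ProcessSample]) -> List[Tuple[str, str, Optional[int]]]:
    """Compare consecutive samples per node; a changed start time means a restart."""
    last_start = {}
    events = []
    for s in samples:
        prev = last_start.get(s.node)
        if prev is not None and s.start_time != prev:
            # A non-zero exit code (for example 137 after SIGKILL) marks the
            # restart as an abnormal termination.
            kind = "abnormal_exit" if s.last_exit_code not in (None, 0) else "restart"
            events.append((s.node, kind, s.last_exit_code))
        last_start[s.node] = s.start_time
    return events
```

Comparing start times rather than counting restarts keeps the check independent of the platform, whether the node is a pod or a bare-metal server.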
2. MinIO Service Health
The system shall track MinIO’s internal service status (online/offline).
The system must detect degraded nodes or uneven workloads.
The system must detect and report drive health issues (missing, unmounted, offline drives).
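The drive health requirement could be sketched as follows, assuming the monitor has already collected one record per drive. The record fields (`node`, `path`, `state`) and the state labels are assumptions made for illustration; a real collector would map its own values onto them.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical label for a healthy drive; anything else counts as a problem.
HEALTHY_STATE = "ok"


def summarize_drive_health(drives: List[Dict[str, str]]) -> Dict[str, object]:
    """Group unhealthy drives by node and list the nodes that are degraded."""
    problems = defaultdict(list)
    for d in drives:
        if d["state"] != HEALTHY_STATE:
            problems[d["node"]].append((d["path"], d["state"]))
    return {
        "degraded_nodes": sorted(problems),
        "problem_drives": {node: sorted(items) for node, items in problems.items()},
    }
```

A node appears in `degraded_nodes` as soon as one of its drives is missing, unmounted, or offline, which gives the report a node-level and a drive-level view from the same input.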
3. S3 Operation Metrics
The system must gather counts of S3 operations grouped by:
Operation type (PUT, GET, DELETE, etc.)
Success vs. error codes (2xx, 4xx, 5xx)
The system must capture performance indicators:
Request latency
Request throughput (objects/sec, bytes/sec)
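The grouping and performance indicators above can be sketched in Python as a single aggregation over one time window. The record layout (`op`, `status`, `latency_ms`, `bytes`) is an assumption, as is the choice of a nearest-rank 95th percentile for latency; the sketch also treats each request as one object when computing objects/sec.

```python
from collections import Counter
from typing import Dict, List


def status_class(code: int) -> str:
    """Map an HTTP status code to its class, e.g. 404 -> '4xx'."""
    return f"{code // 100}xx"


def summarize_requests(records: List[Dict], window_seconds: float) -> Dict:
    """Aggregate one time window of (hypothetical) S3 request records."""
    counts = Counter((r["op"], status_class(r["status"])) for r in records)
    latencies = sorted(r["latency_ms"] for r in records)
    p95 = None
    if latencies:
        # Nearest-rank percentile: rank = ceil(0.95 * n), in integer arithmetic.
        rank = (95 * len(latencies) + 99) // 100
        p95 = latencies[rank - 1]
    return {
        "counts": dict(counts),
        "p95_latency_ms": p95,
        "objects_per_sec": len(records) / window_seconds,
        "bytes_per_sec": sum(r["bytes"] for r in records) / window_seconds,
    }
```

Keying the counter on the pair (operation type, status class) covers both groupings in one pass.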
4. Storage & Capacity Metrics
The system must gather capacity data:
Used vs. available storage per drive / per node
Total cluster capacity
The system must track object count trends.
The system shall detect imbalanced usage across disks or nodes.
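One possible way to derive the capacity figures and flag imbalance is sketched below. It assumes per-drive `total_bytes` and `used_bytes` values are already available, and it defines imbalance as the gap between the fullest and emptiest drive exceeding a configurable `max_spread`. Both the record layout and that definition are assumptions, not something this request specifies.

```python
from typing import Dict, List


def capacity_report(drives: List[Dict], max_spread: float = 0.20) -> Dict:
    """Summarize cluster capacity and flag uneven usage across drives."""
    total = sum(d["total_bytes"] for d in drives)
    used = sum(d["used_bytes"] for d in drives)
    # Fraction used per drive, keyed by (node, path).
    usage = {(d["node"], d["path"]): d["used_bytes"] / d["total_bytes"] for d in drives}
    spread = max(usage.values()) - min(usage.values()) if usage else 0.0
    return {
        "total_bytes": total,
        "used_bytes": used,
        "available_bytes": total - used,
        "usage_by_drive": usage,
        "spread": spread,
        "imbalanced": spread > max_spread,
    }
```

The same function works per node if the input records are aggregated by node before the call.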
5. Data Durability / Healing
The system must expose status and progress of healing operations.
The system must detect when healing:
Starts
Is taking unusually long
Encounters failures or corrupted blocks
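The three healing conditions can be sketched as a comparison between two consecutive status snapshots. The snapshot fields (`active`, `started_at`, `failed_items`) and the one-hour default for "unusually long" are assumptions chosen for illustration; the request itself does not define a duration limit.

```python
from typing import Dict, List, Optional


def healing_events(
    prev: Optional[Dict],
    curr: Dict,
    now: float,
    max_duration_s: float = 3600.0,
) -> List[str]:
    """Derive healing events from the previous and current status snapshots."""
    events = []
    was_active = bool(prev and prev["active"])
    if curr["active"] and not was_active:
        events.append("healing_started")
    if curr["active"] and now - curr["started_at"] > max_duration_s:
        events.append("healing_slow")
    # A growing failure counter means new failed or corrupted items were found.
    prev_failed = prev["failed_items"] if prev else 0
    if curr["failed_items"] > prev_failed:
        events.append("healing_failed_items")
    return events
```

Because only the difference between snapshots is reported, the monitor does not re-raise the same start or failure event on every polling cycle.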
6. Alerting Requirements
The system must generate alerts for conditions such as:
Node or disk failure
High error rates or failed S3 operations
Slow read/write latency
Capacity thresholds (70%, 80%, 90%)
Healing failures or extended healing duration
Security events (multiple failed logins, access anomalies)
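As an example of one alert condition, the capacity thresholds (70%, 80%, 90%) could be evaluated as in the sketch below. The severity names attached to each threshold are assumptions; only the percentages come from the list above.

```python
from typing import Optional, Tuple

# Thresholds from the requirement, highest first; severity names are hypothetical.
THRESHOLDS = ((90, "critical"), (80, "warning"), (70, "info"))


def capacity_alert(used_bytes: int, total_bytes: int) -> Optional[Tuple[int, str]]:
    """Return the highest threshold crossed as (percent, severity), or None."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    used_pct = 100 * used_bytes / total_bytes
    for level, severity in THRESHOLDS:
        if used_pct >= level:
            return (level, severity)
    return None
```

Returning only the highest threshold crossed avoids raising three separate alerts for a drive that is already above 90%.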
This idea will be implemented in 2H 2026.
It is not on our roadmap for the first half of 2025, but it will be reconsidered thereafter.