This is an IBM Automation portal for Cloud Management and AIOps products. To view all of your ideas submitted to IBM, create and manage groups of Ideas, or create an idea explicitly set to be either visible by all (public) or visible only to you and IBM (private), use the IBM Unified Ideas Portal (https://ideas.ibm.com).
Even with JAVA_MAX_MEM set to more than three times the nominal memory usage (e.g. 512m against an average below 140m, as observed in a production environment), the agent can still run out of memory (OOM) under peak load or in certain other circumstances. Recovering from this currently requires a manual systemctl restart of the service, which restores it to "normal" resource consumption.
It is worth noting that automatic restart depends on how the agent was started. An agent running in Kubernetes can be restarted according to its configuration. But when the agent is extracted and/or installed from a package, it cannot recover from a hard crash such as OOM.
The proposal is for the agent to detect an OOM condition and recover from the crash on its own. The systemd service unit supports automatic restart, but the agent does not seem to return a correct failure code when it runs out of memory, so the restart apparently is not triggered. An auto-recovery mechanism for the agent would be desirable.
The benefits of doing so would be:
no manual restart operation (toil)
no "out-of-band" automation script needed to recover (i.e. clean self-recovery)
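To illustrate the kind of behavior being requested, here is a minimal Java sketch of a process that exits with a non-zero code when a thread dies from an OutOfMemoryError, so that a supervisor such as systemd can restart it. This is not the agent's actual code: the class name `OomExitHandler`, the exit code values, and the idea of installing the handler at startup are all assumptions made for the example.

```java
// Hypothetical sketch, not the agent's implementation.
final class OomExitHandler {
    // Assumed values: any non-zero exit status counts as a failure for a supervisor.
    static final int OOM_EXIT_CODE = 3;
    static final int GENERIC_FAILURE_EXIT_CODE = 1;

    // Upper bound on cause-chain hops, to stay safe if the chain is unusually long or cyclic.
    private static final int MAX_CAUSE_DEPTH = 16;

    private OomExitHandler() {}

    // Returns true if the throwable, or one of its causes, is an OutOfMemoryError.
    static boolean isOutOfMemory(Throwable error) {
        Throwable current = error;
        for (int depth = 0; current != null && depth < MAX_CAUSE_DEPTH; depth++) {
            if (current instanceof OutOfMemoryError) {
                return true;
            }
            current = current.getCause();
        }
        return false;
    }

    // Maps an uncaught throwable to the exit code the process should report.
    static int exitCodeFor(Throwable error) {
        return isOutOfMemory(error) ? OOM_EXIT_CODE : GENERIC_FAILURE_EXIT_CODE;
    }

    // Installs a JVM-wide handler for exceptions that no thread caught.
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            // halt() skips shutdown hooks, which may themselves need heap
            // that is no longer available after an OutOfMemoryError.
            Runtime.getRuntime().halt(exitCodeFor(error));
        });
    }
}
```

Calling `OomExitHandler.install()` early in startup would make an uncaught OutOfMemoryError end the process with a failure status, which a service unit configured with `Restart=on-failure` treats as a reason to restart. A handler like this only sees errors that escape a thread; an OutOfMemoryError that is caught and swallowed elsewhere would not reach it. HotSpot JVMs also offer the `-XX:+ExitOnOutOfMemoryError` option, which exits on the first OutOfMemoryError without any code change; whether the agent's launcher allows passing such an option is not known here.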