Cloud Management and AIOps



Workspace ITM/IBM APM
Created by Guest
Created on Sep 7, 2021

ITCAM MS SQL Server Agent should self-recover the Collector on SQL Server recycle

The background and the issue to solve:
Currently the MS SQL agent, version V6.3.1.20, fails to return data to TEPS after the SQL Server instance has been stopped and then started again. This occurs if the SQL Server instance has been down for more than the default 3 minutes. It happens because the Collector part of the agent needs the SQL instance to be available to connect to. By default the Collector will attempt to start and connect 3 times at 1-minute intervals and then shut down.
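To make the timing concrete, here is a minimal sketch of the arithmetic described above. It assumes the retry window is simply the retry count multiplied by the retry interval; the function and parameter names are hypothetical and are not part of the agent.

```python
def retry_window_seconds(retry_count: int = 3, retry_interval_seconds: int = 60) -> int:
    """Total time the Collector keeps retrying (assumed: count x interval)."""
    return retry_count * retry_interval_seconds


def collector_gives_up(outage_seconds: float,
                       retry_count: int = 3,
                       retry_interval_seconds: int = 60) -> bool:
    """True if the SQL instance stays down longer than the retry window,
    in which case the Collector shuts down and stops returning data."""
    return outage_seconds > retry_window_seconds(retry_count, retry_interval_seconds)


if __name__ == "__main__":
    print(retry_window_seconds())        # default window in seconds
    print(collector_gives_up(5 * 60))    # a 5-minute outage against the defaults
```

With the defaults this gives a 180-second window, which matches the "more than 3 minutes" condition reported above.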

Possible manual mitigation
It is possible to mitigate this by extending the retry interval via an Override Local Value setting of the COLL_MSSQL_RETRY_INTERVAL and/or COLL_MSSQL_RETRY_CNT environment variables. However, these must be set via the MTEMS UI or by running TACMD ConfigureSystem post-install, and it is not obvious how long the interval should be set to. Alternatively, a situation can be created that detects that the Collector is stopped while the SQL instance is running, and takes an action to start the Collector. Both of these help ease the issue, but in a large enterprise organization with several hundred instances to monitor, neither is optimal.

The Requested change/Enhancement

The core agent KOQAgent_<Instance_name> will respond to the scenario where the collector KOQCOLL_<instance_name> is stopped, has exhausted its retries (reached the default or overridden retry count), and the SQL instance <instance_name> is running. The response will be to attempt to restart KOQCOLL_<instance_name>. The default behaviour will be to run this process once and then fire a situation to the TEPS, so that the condition can be alerted on in the installation's ITSM tooling.
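The requested decision logic could be sketched roughly as follows. This is only an illustration of the three conditions and the default "restart once, then fire a situation" behaviour. The class, the function names and the two callbacks are hypothetical placeholders and do not correspond to any existing agent API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentView:
    """What the core agent is assumed to know about one instance."""
    collector_running: bool
    retries_exhausted: bool
    sql_instance_running: bool


def should_restart_collector(view: AgentView) -> bool:
    """All three conditions from the request must hold."""
    return (not view.collector_running
            and view.retries_exhausted
            and view.sql_instance_running)


def recover_once(view: AgentView,
                 restart_collector: Callable[[], bool],
                 fire_situation: Callable[[], None]) -> bool:
    """Default behaviour: one restart attempt, then fire the situation.
    Returns True if a restart was attempted."""
    if not should_restart_collector(view):
        return False
    restart_collector()   # hypothetical hook that starts the Collector
    fire_situation()      # hypothetical hook that raises the situation
    return True
```

If the SQL instance is still down, nothing is attempted, so the agent does not restart a Collector that would immediately fail again.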

The default behaviour should be configurable via silent_install response files or directly in config files, such that changes can be picked up by a recycle of the agent or pushed via TACMD. It should be possible to change the behaviour from the default of 1 attempt before alerting, either to switch it off or to make a larger number of attempts as specified by the installation. It should be possible to toggle the alert on or off, or to set it to alert every 'n' tries, where 'n' is not larger than the maximum retries set. The alert should include details of the instance, the hostname and the number of attempts made to start the Collector.
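As a sketch of how these settings might fit together, the example below models the configuration and the "alert every 'n' tries" rule. All names here are hypothetical: the request does not define setting names or an alert format, and treating 0 attempts as "switched off" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryConfig:
    max_attempts: int = 1       # assumed: 0 switches self-recovery off
    alert_enabled: bool = True  # toggles the alert on or off
    alert_every: int = 1        # alert every 'n' tries

    def __post_init__(self) -> None:
        if self.max_attempts < 0:
            raise ValueError("max_attempts must be 0 or greater")
        if self.alert_every < 1:
            raise ValueError("alert_every must be 1 or greater")
        # 'n' must not be larger than the maximum retries set.
        if self.max_attempts > 0 and self.alert_every > self.max_attempts:
            raise ValueError("alert_every must not exceed max_attempts")


def should_alert(config: RecoveryConfig, attempts_made: int) -> bool:
    """True when the given attempt number should raise an alert."""
    return (config.alert_enabled
            and attempts_made > 0
            and attempts_made % config.alert_every == 0)


def format_alert(instance: str, hostname: str, attempts_made: int) -> str:
    """Alert text carrying the three details the request asks for."""
    return (f"Collector restart attempted: instance={instance} "
            f"host={hostname} attempts={attempts_made}")
```

For example, a configuration with 6 attempts and an alert every 3 tries would alert on the third and sixth attempt, while a setting of 3 tries with only 2 attempts allowed would be rejected when the configuration is loaded.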

Idea priority High