As an IBM APM V8 and IBM Instana customer, we request a backup and restore capability for all configured alerts and events. This capability would support operational continuity, disaster recovery, and change management in large environments where hundreds of alert and event definitions are maintained.
Requested enhancement
=======================
Introduce a built-in feature in IBM Instana to:
- Back up / export all configured Smart Alerts and Events (including their full definitions, scopes, conditions, thresholds, severities, and notification settings).
- Restore / import those definitions on the same or a different Instana backend (for example, after a failure, migration, or environment refresh).
- Run backups automatically on a schedule, as well as on demand.
Detailed requirements
======================
1. Comprehensive export / backup of alert and event definitions
- Export all alert and event configurations in a structured format (for example, JSON or YAML) that can be version-controlled and stored securely.
- Include all metadata such as: names, descriptions, scopes, filters, conditions, thresholds, severity levels, notification channels, and any dependencies.
- Support full export (all alerts/events) and filtered export (for example, by application, team, or tag).
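
To illustrate what a version-control-friendly export could look like, here is a minimal Python sketch. It is not an existing Instana feature or API: the function name `build_export_bundle`, the bundle layout, and the field names (`id`, `name`, `tags`, `formatVersion`) are all assumptions made for this example. It shows a full export, a filtered export by tag, and deterministic output so that diffs between backups stay small.

```python
import json


def build_export_bundle(definitions, tag=None):
    """Serialize alert/event definitions into a deterministic JSON string.

    `definitions` is a list of dicts. The field names used here ("id", "tags")
    are hypothetical and only serve this illustration.
    If `tag` is given, only definitions carrying that tag are exported.
    """
    # Filtered export: keep everything when no tag is requested.
    selected = [d for d in definitions if tag is None or tag in d.get("tags", [])]
    # Stable ordering, so the same configuration always yields the same file.
    selected.sort(key=lambda d: d["id"])
    bundle = {
        "formatVersion": 1,  # hypothetical version marker for future changes
        "count": len(selected),
        "definitions": selected,
    }
    # sort_keys makes key order deterministic as well, which keeps diffs clean.
    return json.dumps(bundle, indent=2, sort_keys=True)
```

Because the output is sorted by identifier and by key, committing each export to a repository would show only real configuration changes in the diff.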
2. Reliable import / restore capability
- Allow restoring exported alert and event definitions to:
  - The same Instana backend (for rollback after misconfiguration or accidental deletion).
  - A different backend (for DR, staging-to-production promotion, or migration to new servers).
- Provide safety options such as dry-run / preview, duplicate handling rules, and the ability to selectively import specific alerts or groups.
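
The safety options above could behave roughly as in the following Python sketch. It is only an illustration of the requested semantics, not an existing Instana interface: the function `plan_import`, the rule names (`skip`, `overwrite`, `rename`), and the `id` field are assumptions. The function computes a preview of what an import would do and changes nothing, which is the essence of a dry run.

```python
def plan_import(existing, incoming, on_duplicate="skip", only_ids=None):
    """Return a dry-run plan as a list of (action, id) tuples.

    `existing` and `incoming` are lists of dicts with a hypothetical "id" field.
    `on_duplicate` selects the rule for definitions already on the target:
    "skip" keeps the target version, "overwrite" replaces it, and "rename"
    imports the incoming definition as a separate copy.
    `only_ids` optionally restricts the import to selected definitions.
    """
    if on_duplicate not in ("skip", "overwrite", "rename"):
        raise ValueError(f"unknown duplicate rule: {on_duplicate}")

    existing_ids = {d["id"] for d in existing}
    plan = []
    for definition in incoming:
        def_id = definition["id"]
        # Selective import: ignore anything not explicitly requested.
        if only_ids is not None and def_id not in only_ids:
            continue
        if def_id not in existing_ids:
            plan.append(("create", def_id))
        elif on_duplicate == "skip":
            plan.append(("skip", def_id))
        elif on_duplicate == "overwrite":
            plan.append(("update", def_id))
        else:  # "rename"
            plan.append(("create-copy", def_id))
    return plan
```

An operator could review the returned plan before confirming the restore, and an actual import would then apply exactly those actions.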
3. Scheduling and automation
- Support scheduled backups (for example, daily or weekly), either configured within Instana or triggered through an API by external schedulers.
- Make backups scriptable/automatable so enterprises can integrate them into existing backup and DevOps pipelines.
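
As a sketch of how an external scheduler could consume such an export, the following Python function stores each backup under a timestamped file name and keeps only the most recent copies. The function name `write_backup`, the `alerts-*.json` naming scheme, and the retention count are assumptions for illustration. The export content itself is passed in as a string, since how it would be obtained from Instana is exactly what this request asks for.

```python
from datetime import datetime, timezone
from pathlib import Path


def write_backup(bundle_json, directory, keep=7, now=None):
    """Write one backup file and prune older ones, keeping the newest `keep`.

    `bundle_json` is the exported configuration as a string.
    `now` is assumed to be a UTC datetime; it defaults to the current time
    and exists mainly so that the function can be tested deterministically.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    now = now or datetime.now(timezone.utc)
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)

    # Timestamped names sort chronologically when sorted as plain strings.
    target = directory / f"alerts-{now:%Y%m%dT%H%M%SZ}.json"
    target.write_text(bundle_json, encoding="utf-8")

    # Retention: delete everything except the newest `keep` files.
    backups = sorted(directory.glob("alerts-*.json"))
    for old in backups[:-keep]:
        old.unlink()
    return target
```

A script like this could be called from cron or a CI pipeline job, which is the kind of integration a scriptable backup would make possible.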
Rationale and customer impact
==============================
- Our environment has hundreds of alerts and events defined, representing significant operational knowledge and tuning effort. Losing these configurations or having to recreate them manually after an incident or migration would be extremely costly and error-prone.
- A robust backup/restore (export/import) mechanism ensures that, if something goes wrong with the backend servers or configurations, operations can quickly resume with the original alert and event definitions on the same or new backend instances.
- This is a basic expectation for enterprise observability platforms and would benefit many customers with large-scale, regulated, or mission-critical environments.