Cloud Management and AIOps



Status Future consideration
Workspace SevOne
Created by Guest
Created on Jun 2, 2026

Enable Application-Level Message Batching in SevOne Data Publisher (SDP) for Array-of-JSON Format

Problem Statement

Current Behavior: SevOne Data Publisher (SDP) currently publishes each data point as its own message to Kafka, Pulsar, and Azure Event Hub. Kafka's protocol-level batching improves network efficiency between producer and broker, but consumers still receive one JSON message per data point.

Example - Current Output:

Message 1: {"deviceId": 1, "objectId": 2, "value": 100, "timestamp": 1234567890}

Message 2: {"deviceId": 1, "objectId": 3, "value": 200, "timestamp": 1234567891}

Message 3: {"deviceId": 1, "objectId": 4, "value": 300, "timestamp": 1234567892}

 

Business Impact:

  • Higher Consumer Processing Overhead: Consumers must process each message individually
  • Increased Network Calls: More frequent consumer polling and processing
  • Higher Kafka Partition Load: More messages = more partition metadata overhead
  • Inefficient for Bulk Processing: Downstream analytics systems prefer batch processing
  • Cost Impact: More messages = higher cloud messaging costs (especially Azure Event Hub)

Proposed Solution

Feature Request: Add configurable application-level batching that groups multiple data points into a single message as a JSON array.

Desired Output:

Message 1:
[
  {"deviceId": 1, "objectId": 2, "value": 100, "timestamp": 1234567890},
  {"deviceId": 1, "objectId": 3, "value": 200, "timestamp": 1234567891},
  {"deviceId": 1, "objectId": 4, "value": 300, "timestamp": 1234567892}
]

 

Configuration Parameters:

sdp:
  enable-batching: true
  batch-size: 100              # Max messages per batch (X)
  batch-max-wait-ms: 1000      # Max wait time from first message (Y milliseconds)

 

Batching Logic:

  • Collect up to X data points OR wait Y milliseconds from the first data point in the batch (whichever comes first)
  • Send batch as single message containing JSON array
  • Apply to all publisher types: Kafka, Pulsar, Azure Event Hub
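
The size-or-time rule above can be sketched as a small simulation. This is only an illustration under assumptions, not SDP code: the function name batch_points and the (arrival_ms, point) input shape are hypothetical, and a real publisher would flush from a timer instead of waiting for the next data point to arrive.

```python
import json
from typing import Dict, Iterable, Iterator, Tuple


def batch_points(
    events: Iterable[Tuple[int, Dict]],
    batch_size: int,
    max_wait_ms: int,
) -> Iterator[str]:
    """Group (arrival_ms, point) pairs into JSON-array payloads.

    A batch closes when it holds batch_size points, or when a point arrives
    max_wait_ms or more after the first point of the open batch.
    Hypothetical helper for illustration only.
    """
    batch = []
    first_ms = None
    for arrival_ms, point in events:
        # Time trigger: the open batch has waited long enough, so flush it
        # before accepting the new point.
        if batch and arrival_ms - first_ms >= max_wait_ms:
            yield json.dumps(batch)
            batch = []
        if not batch:
            first_ms = arrival_ms
        batch.append(point)
        # Size trigger: the batch is full.
        if len(batch) >= batch_size:
            yield json.dumps(batch)
            batch = []
    # End of input stands in for graceful shutdown: flush what is left.
    if batch:
        yield json.dumps(batch)


events = [
    (0, {"deviceId": 1, "objectId": 2, "value": 100}),
    (10, {"deviceId": 1, "objectId": 3, "value": 200}),
    (20, {"deviceId": 1, "objectId": 4, "value": 300}),
    (2000, {"deviceId": 1, "objectId": 5, "value": 400}),
]
payloads = list(batch_points(events, batch_size=2, max_wait_ms=1000))
```

With batch_size=2 and max_wait_ms=1000, the first two points form a full batch, the third is flushed by the time trigger when the fourth arrives, and the fourth is flushed at the end of input.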

Business Benefits

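  1. Reduced Consumer Load: Process 100 data points in one operation instead of 100 separate operations
  2. Lower Consumer-Side Latency: Fewer network round-trips for consumers (batching itself delays publishing by at most Y milliseconds, bounded by batch-max-wait-ms)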
  3. Cost Savings:
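    • Azure Event Hub: Ingress is billed per event, so 100 data points sent as 1 message count as 1 event instead of 100 (as long as the batched message stays within the 64 KB billing unit)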
    • Kafka: Reduced partition metadata overhead
  4. Better Analytics Performance: Bulk inserts into databases/data lakes
  5. Backward Compatible: Can be disabled for existing deployments
  6. Universal: Works across Kafka, Pulsar, and Event Hub publishers

 

Key Components:

  1. BatchProducer Wrapper: Wraps existing producers with batching logic
  2. Configurable Parameters: batch-size and batch-max-wait-ms
  3. Flush Triggers:
    • Batch full (X messages)
    • Timer expired (Y milliseconds)
    • Graceful shutdown
  4. Error Handling: Batch-level errors propagated to all constituent messages
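
A minimal sketch of how such a wrapper might look, assuming the underlying producer can be reduced to a send(payload) callable. Only the name BatchProducer and the two parameters come from this idea; the methods publish, flush, and close and the on_done callback are hypothetical, and the sketch does not reflect SDP internals.

```python
import json
import threading
from typing import Callable, Dict, List, Optional

DoneCallback = Callable[[Optional[Exception]], None]


class BatchProducer:
    """Wraps a send(payload) callable with size- and time-based batching."""

    def __init__(self, send: Callable[[str], None],
                 batch_size: int = 100, max_wait_ms: int = 1000) -> None:
        self._send = send
        self._batch_size = batch_size
        self._max_wait_s = max_wait_ms / 1000.0
        # Re-entrant so a callback may call publish() during a flush.
        self._lock = threading.RLock()
        self._points: List[Dict] = []
        self._callbacks: List[Optional[DoneCallback]] = []
        self._timer: Optional[threading.Timer] = None

    def publish(self, point: Dict, on_done: Optional[DoneCallback] = None) -> None:
        with self._lock:
            self._points.append(point)
            self._callbacks.append(on_done)
            if len(self._points) >= self._batch_size:
                self._flush_locked()          # trigger: batch full
            elif self._timer is None:
                # trigger: timer measured from the first point in the batch
                self._timer = threading.Timer(self._max_wait_s, self.flush)
                self._timer.daemon = True
                self._timer.start()

    def flush(self) -> None:
        with self._lock:
            self._flush_locked()

    def close(self) -> None:
        self.flush()                          # trigger: graceful shutdown

    def _flush_locked(self) -> None:
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        if not self._points:
            return
        points, callbacks = self._points, self._callbacks
        self._points, self._callbacks = [], []
        error: Optional[Exception] = None
        try:
            self._send(json.dumps(points))    # one message, JSON array body
        except Exception as exc:              # batch-level failure
            error = exc
        # Propagate the single batch result to every constituent point.
        for callback in callbacks:
            if callback is not None:
                callback(error)


sent: List[str] = []
results: List[Optional[Exception]] = []
producer = BatchProducer(sent.append, batch_size=2, max_wait_ms=60000)
producer.publish({"objectId": 2}, results.append)
producer.publish({"objectId": 3}, results.append)   # fills the batch
producer.publish({"objectId": 4}, results.append)
producer.close()                                    # flushes the remainder
```

Passing the producer as a plain callable keeps the wrapper independent of the Kafka, Pulsar, or Event Hub client in use; each publisher type would supply its own send function.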

Use Cases

Use Case 1: High-Volume Monitoring

  • Scenario: 10,000 devices × 100 metrics = 1M data points/minute
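  • Current: 1M messages/minute
  • With Batching (batch-size: 100): 10K messages/minute (100x reduction)
  • Benefit: 100x fewer messages to publish, store, and consume, which translates into large cost savings on per-message-billed services such as Azure Event Hub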
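
The arithmetic behind this use case can be checked with a short sketch. It assumes every batch fills to batch-size before the wait timer fires; the function names are hypothetical.

```python
import math


def messages_per_minute(data_points_per_minute: int, batch_size: int) -> int:
    """Messages published per minute when every batch fills completely."""
    return math.ceil(data_points_per_minute / batch_size)


def reduction_percent(batch_size: int) -> float:
    """Percentage drop in message count compared with one point per message."""
    return 100 * (batch_size - 1) / batch_size


data_points = 10_000 * 100          # 10,000 devices x 100 metrics per minute
batched = messages_per_minute(data_points, batch_size=100)
saving = reduction_percent(100)
```

Under that assumption, 1M data points per minute become 10,000 messages per minute, a 99% reduction. Batches flushed early by the timer carry fewer points, so the real reduction depends on the average fill level.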

Use Case 2: Analytics Pipeline

  • Scenario: Streaming data to Elasticsearch/Splunk
  • Current: Individual inserts (slow)
  • With Batching: Bulk inserts (10-100x faster)
  • Benefit: Real-time dashboards with lower latency

Use Case 3: Cloud Cost Optimization

  • Scenario: Azure Event Hub charges per message
  • Current: $X per million data points (one message per data point)
  • With Batching (batch-size: 100): $X/100 per million data points
  • Benefit: Direct cost reduction

Competitive Analysis

Industry Standard:

  • Telegraf: Supports batching with metric_batch_size and metric_buffer_limit
  • Logstash: Has pipeline.batch.size and pipeline.batch.delay pipeline settings
  • Fluentd: Supports buffering and batching
  • Datadog Agent: Batches metrics before sending

SDP Gap: Currently lacks application-level batching for array-of-JSON format

Customer Impact

Priority: High

Affected Customers:

  • All customers using SDP with high-volume data collection
  • Customers using Azure Event Hub (cost-sensitive)
  • Customers with analytics pipelines requiring bulk processing
  • Customers with downstream systems that prefer batched data

Workaround Complexity: High

  • Requires custom consumer-side batching logic
  • Increases consumer complexity and maintenance
  • No control over batch size from SDP side

Testing Requirements

Functional Testing:

  • Batch fills to X messages → flushes
  • Timer expires at Y ms → flushes
  • Graceful shutdown → flushes remaining batch
  • Error handling → a batch-level failure is reported for every data point in the batch

Performance Testing:

  • Throughput: Compare batched vs non-batched
  • Latency: Measure end-to-end delay
  • Memory: Monitor batch buffer usage

Integration Testing:

  • Kafka producer with batching
  • Pulsar producer with batching
  • Event Hub producer with batching

Backward Compatibility:

  • Existing configs work without changes
  • Batching disabled by default
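
One way to meet both compatibility points is to treat a missing batching section as "disabled". The sketch below assumes the YAML has already been parsed into a dictionary and uses the key names proposed in this idea; the class and function names are hypothetical, and the fallback values 100 and 1000 are taken from the example configuration, not from any confirmed default.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class BatchingConfig:
    enabled: bool = False        # batching stays off unless explicitly enabled
    batch_size: int = 100        # assumed fallback, from the example config
    max_wait_ms: int = 1000      # assumed fallback, from the example config


def load_batching_config(parsed: Dict[str, Any]) -> BatchingConfig:
    """Read the proposed keys from an already-parsed config dictionary."""
    sdp = parsed.get("sdp") or {}
    config = BatchingConfig(
        enabled=bool(sdp.get("enable-batching", False)),
        batch_size=int(sdp.get("batch-size", 100)),
        max_wait_ms=int(sdp.get("batch-max-wait-ms", 1000)),
    )
    # Validate only when batching is on, so old configs can never fail here.
    if config.enabled and (config.batch_size < 1 or config.max_wait_ms < 0):
        raise ValueError("batch-size must be >= 1 and batch-max-wait-ms >= 0")
    return config


legacy = load_batching_config({"sdp": {}})
tuned = load_batching_config(
    {"sdp": {"enable-batching": True, "batch-size": 50, "batch-max-wait-ms": 500}}
)
```

A configuration written before this feature existed yields enabled=False, so the publisher would keep sending one data point per message.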


Documentation Requirements

Configuration Guide:

  • How to enable batching
  • Parameter tuning guidelines
  • Performance considerations

Migration Guide:

  • Consumer changes needed (parse JSON array)
  • Rollback procedure
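
On the consumer side, the change can be small if the decoder accepts both shapes. The sketch below is an assumption about how a consumer might be written, with a hypothetical function name: it returns a list of data points whether the message body is a single JSON object (current format) or a JSON array (batched format), which also keeps a rollback on the SDP side from breaking consumers.

```python
import json
from typing import Any, Dict, List, Union


def decode_points(payload: Union[str, bytes]) -> List[Dict[str, Any]]:
    """Return the data points in a message body, batched or not."""
    decoded = json.loads(payload)
    if isinstance(decoded, list):
        return decoded            # batched format: array of data points
    if isinstance(decoded, dict):
        return [decoded]          # current format: one data point
    raise ValueError("expected a JSON object or a JSON array of objects")


single = decode_points('{"deviceId": 1, "objectId": 2, "value": 100}')
batched_points = decode_points(
    b'[{"deviceId": 1, "objectId": 2, "value": 100},'
    b' {"deviceId": 1, "objectId": 3, "value": 200}]'
)
```

Both calls return a list, so downstream code can iterate over data points without knowing whether batching is enabled.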

Best Practices:

  • Recommended batch sizes for different scenarios
  • Latency vs throughput tradeoffs

Success Metrics

KPIs to Track:

  1. Message Reduction: % reduction in Kafka messages sent
  2. Consumer Performance: Processing time improvement
  3. Cost Savings: Azure Event Hub cost reduction
  4. Adoption Rate: % of customers enabling batching
  5. Customer Satisfaction: Feedback scores

Target Goals:

  • 50-90% reduction in message count (depending on batch size)
  • 30-50% improvement in consumer throughput
  • 50-90% cost reduction for Event Hub customers
Idea priority High