What We Need
A Cloudability feature that analyzes Azure Event Hubs usage and tells us when we're paying for more capacity than we need (or when we're about to hit limits). Since tier and capacity are set at the namespace level, we need rightsizing recommendations at that level - but with visibility into what each individual event hub is consuming.
Why This Matters
Event Hubs can get expensive fast, especially Premium tier. We need to know:
- Are we overprovisioned? (wasting money)
- Are we about to hit limits? (performance risk)
- Should we split or consolidate namespaces?
- Should we switch between Standard and Premium tiers?
Analysis Scope - What to Support
Multi-Level Analysis (Priority Order)
1. Single Namespace Analysis - Deep dive with individual event hub breakdown
   - Shows total namespace utilization
   - Breaks down which event hubs are consuming what percentage
   - Identifies "noisy neighbors" hogging resources
2. Batch Namespace Analysis - Multiple namespaces in one report
   - Summary view: "15 namespaces analyzed, 8 optimization opportunities, $12K/month savings"
   - Drill down into any namespace for details
3. Subscription-Level - All namespaces in a subscription
4. Cross-Subscription - All namespaces across multiple subscriptions
Why Individual Event Hub Visibility Matters
Even though we rightsize at the namespace level, we need to see individual event hub usage to make smart decisions:
Example: prod-eventhub-namespace (Premium, 2 PUs, $2,400/month)
├─ orders-hub:    85% of throughput → maybe needs its own namespace
├─ inventory-hub: 10% of throughput ─┐
├─ logging-hub:    3% of throughput  ├─ could consolidate these three
└─ analytics-hub:  2% of throughput ─┘  into one Standard namespace
What to Collect
Basic Info
- Subscription/Account Name
- Vendor: Azure
- Resource Group
- Namespace Name
- Current Tier: Standard or Premium
- Current Capacity: # of TUs (Standard) or PUs (Premium)
- Auto-Inflate Status (Standard tier only): Enabled/Disabled + max units
- Date Range: minimum 30 days recommended
Metrics to Track
Throughput (Most Important for Rightsizing)
- Incoming bytes/sec (ingress) - converted to MB/s
- Outgoing bytes/sec (egress) - converted to MB/s
- Incoming messages/sec
- Outgoing messages/sec
- Track for each: Average, Peak, P95, P99
Why these matter:
- Standard: 1 TU = 1 MB/s ingress OR 2 MB/s egress (whichever hits first)
- Premium: 1 PU ≈ 8 MB/s combined throughput
- Your bottleneck is whichever limit you hit first (usually egress on Standard)
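The capacity math above can be sketched in a few lines. This is a minimal illustration, not production code: the function names are made up here, and the 8 MB/s-per-PU figure is the approximate guidance this document assumes, not a hard Azure limit.

```python
# Sketch of the namespace throughput-utilization math described above.
# Assumptions (from this document): 1 TU = 1 MB/s ingress or 2 MB/s egress;
# 1 PU ≈ 8 MB/s combined throughput. Function names are illustrative.

def standard_utilization(tus: int, ingress_mbps: float, egress_mbps: float) -> float:
    """Standard tier: the binding limit (ingress or egress) is the bottleneck."""
    ingress_util = ingress_mbps / (tus * 1.0)   # 1 MB/s ingress per TU
    egress_util = egress_mbps / (tus * 2.0)     # 2 MB/s egress per TU
    return max(ingress_util, egress_util)       # constraining metric wins

def premium_utilization(pus: int, ingress_mbps: float, egress_mbps: float) -> float:
    """Premium tier: combined throughput against ~8 MB/s per PU."""
    return (ingress_mbps + egress_mbps) / (pus * 8.0)

# 10 TUs, P95 egress 7.2 MB/s -> 7.2 / 20 MB/s = 36% utilization
print(round(standard_utilization(10, 2.0, 7.2) * 100))  # -> 36
```

Note the `max()` on Standard: a namespace with low ingress can still be egress-bound, which is why the document recommends always using the higher of the two as the constraining metric.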
Performance Issues
- Throttled requests (send + consumer)
- Server errors
- User errors
- Success rate
Connections
- Active connections (current, peak, average)
- Connections opened/closed per period
Premium Tier Only
- CPU usage % (average, peak)
- Memory usage % (average, peak)
Additional
- Namespace storage utilization
- Capture backlog (if using the Capture feature)
Rightsizing Thresholds - When to Recommend Changes
Important: All utilization percentages below refer to throughput utilization - the percentage of ingress/egress capacity being used based on the calculations above. Always use the higher of ingress or egress as your constraining metric.
Safe to Downsize (High Confidence)
- P95 throughput utilization < 45%
- Peak throughput utilization < 65%
- Throttling < 0.1% of requests
- Sustained for 70%+ of analysis period
- Action: Reduce capacity by 20-30%
- Example: 10 TUs, P95 egress 7.2 MB/s → 36% utilization → reduce to 7 TUs
Critical Downsize (Very Safe)
- P95 throughput utilization < 30%
- Peak throughput utilization < 50%
- Zero throttling
- Action: Reduce capacity by 30-40%
- Example: 10 TUs, P95 egress 4.5 MB/s → 22% utilization → reduce to 6 TUs
Needs Upsize (Performance Risk)
- P95 throughput utilization > 75%
- Peak throughput utilization > 90%
- Throttling > 1% of requests
- Action: Increase capacity by 20-30%
- Example: 10 TUs, P95 egress 16.2 MB/s → 81% utilization → increase to 13 TUs
Critical Upsize (Act Now)
- P95 throughput utilization > 85%
- Peak throughput utilization > 95%
- Throttling > 5% of requests
- Sustained high usage > 1 hour
- Action: Increase capacity by 40-50% immediately
- Example: 10 TUs, peak egress 19.5 MB/s → 97% utilization → increase to 15 TUs NOW
Optimal Range (No Changes)
- P95 throughput: 55-70%
- Peak throughput: 75-85%
- Throttling: < 1%
This range provides:
- Enough headroom for traffic spikes
- Cost efficiency (not grossly overprovisioned)
- Performance safety margin
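The threshold bands above translate directly into a small decision function. The sketch below copies the numbers from this section; the function name and return strings are illustrative, and the "sustained for 70%+ of the period" and "sustained > 1 hour" duration conditions are deliberately omitted to keep it short.

```python
# Hedged sketch of the rightsizing threshold logic above. Thresholds are
# taken from this section; duration-based conditions are not modeled.
# Utilization inputs are fractions (0-1); throttle_rate is the fraction
# of requests throttled.

def classify(p95_util: float, peak_util: float, throttle_rate: float) -> str:
    # Check upsize conditions first: performance risk outranks savings.
    if p95_util > 0.85 or peak_util > 0.95 or throttle_rate > 0.05:
        return "critical upsize: +40-50% capacity immediately"
    if p95_util > 0.75 or peak_util > 0.90 or throttle_rate > 0.01:
        return "upsize: +20-30% capacity"
    # Then the downsize bands, strictest (safest) first.
    if p95_util < 0.30 and peak_util < 0.50 and throttle_rate == 0:
        return "critical downsize: -30-40% capacity"
    if p95_util < 0.45 and peak_util < 0.65 and throttle_rate < 0.001:
        return "downsize: -20-30% capacity"
    return "no change (optimal or inconclusive)"

# 10 TUs, P95 egress 7.2 MB/s (36%), peak 55%, no throttling
print(classify(0.36, 0.55, 0.0))  # -> downsize: -20-30% capacity
```

Ordering matters: upsize checks run first so that a namespace with a low P95 but heavy throttling is never flagged as a savings opportunity.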
Tier Change Recommendations
Premium → Standard:
- P95 < 35% consistently
- Predictable, steady workload
- No dedicated resource requirements
- Potential 40-60% cost savings
Standard → Premium:
- Frequent throttling at max TUs
- Need predictable performance
- CPU/memory constraints on Standard
What Each Recommendation Should Include
Summary View
Current State:
- Namespace: prod-events-ns
- Tier: Premium, 3 PUs
- Current Cost: $3,600/month
- Ingress: 2.1 MB/s (P95), 3.2 MB/s (Peak)
- Egress: 4.3 MB/s (P95), 6.8 MB/s (Peak)
- Combined: 6.4 MB/s (P95), 10.0 MB/s (Peak)
- Capacity: 24 MB/s (3 PUs × 8 MB/s)
- Utilization: 27% (P95), 42% (Peak)
Recommendation:
- Reduce to 2 PUs (Premium)
- New Capacity: 16 MB/s
- New Utilization: 40% (P95), 63% (Peak)
- New Cost: $2,400/month
- Savings: $1,200/month ($14,400/year)
- Confidence: High (95%)
- Buffer: 37% headroom above Peak
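The arithmetic behind this summary view can be reproduced with a short sketch. The per-PU price is back-derived from this example ($3,600 / 3 PUs), and the 8 MB/s-per-PU capacity figure follows this document's assumption; key names are illustrative.

```python
# Sketch of the Premium summary-view arithmetic above.
# COST_PER_PU is derived from this document's $3,600 / 3 PU example;
# MBPS_PER_PU is the ~8 MB/s-per-PU assumption used throughout.

COST_PER_PU = 1200.0   # $/month per PU (assumed from the example)
MBPS_PER_PU = 8.0      # combined MB/s per PU (assumed)

def premium_summary(current_pus: int, target_pus: int,
                    p95_mbps: float, peak_mbps: float) -> dict:
    """Project utilization and savings for a proposed PU count."""
    new_capacity = target_pus * MBPS_PER_PU
    monthly_savings = (current_pus - target_pus) * COST_PER_PU
    return {
        "new_capacity_mbps": new_capacity,
        "new_p95_util_pct": round(p95_mbps / new_capacity * 100, 1),
        "new_peak_util_pct": round(peak_mbps / new_capacity * 100, 1),
        "monthly_savings": monthly_savings,
        "annual_savings": monthly_savings * 12,
    }

# prod-events-ns: 3 PUs -> 2 PUs, combined P95 6.4 MB/s, peak 10.0 MB/s
print(premium_summary(3, 2, 6.4, 10.0))
```

Running this on the example above reproduces the recommendation's figures: 16 MB/s of new capacity, 40% P95 utilization, ~63% peak utilization, and $1,200/month ($14,400/year) in savings.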