24 changes: 23 additions & 1 deletion content/integrate/redis-data-integration/observability.md
@@ -113,6 +113,25 @@ RDI reports with their descriptions.
| `monitor_time_elapsed_created` | Gauge | Timestamp when the monitor time elapsed counter was created | Informational - no alerting needed |
| `rdi_incoming_entries` | Gauge | Count of incoming events by `data_source` and `operation` type (pending, inserted, updated, deleted, filtered, rejected) | Informational - monitor for trends, alert only on "rejected" > 0 |
| `rdi_stream_event_latency_ms` | Gauge | Latency in milliseconds of the oldest event in each data stream, labeled by `data_source` | Informational - monitor based on business SLA requirements |
| **Processor Performance Total Metrics** | | | |
| `rdi_processed_batches_total` | Counter | Total number of processed batches | Informational - use for data ingestion and load tracking |
| `rdi_processor_batch_size_total` | Counter | Total batch size across all processed batches | Informational - use for throughput analysis |
| `rdi_processor_read_time_ms_total` | Counter | Total read time in milliseconds across all batches | Informational - use for performance analysis |
| `rdi_processor_transform_time_ms_total` | Counter | Total transform time in milliseconds across all batches | Informational - use for performance analysis |
| `rdi_processor_write_time_ms_total` | Counter | Total write time in milliseconds across all batches | Informational - use for performance analysis |
| `rdi_processor_process_time_ms_total` | Counter | Total process time in milliseconds across all batches | Informational - use for performance analysis |
| `rdi_processor_ack_time_ms_total` | Counter | Total acknowledgment time in milliseconds across all batches | Informational - use for performance analysis |
| `rdi_processor_total_time_ms_total` | Counter | Sum of the total `read_time`, `process_time` and `ack_time` values in milliseconds across all batches | Informational - use for performance analysis |
| `rdi_processor_rec_per_sec_total` | Gauge | Total records per second across all batches | Informational - use for throughput analysis |
| **Processor Performance Last Batch Metrics** | | | |
| `rdi_processor_batch_size_last` | Gauge | Last batch size processed | Informational - use for real-time monitoring |
| `rdi_processor_read_time_ms_last` | Gauge | Last batch read time in milliseconds | Informational - use for real-time performance monitoring |
| `rdi_processor_transform_time_ms_last` | Gauge | Last batch transform time in milliseconds | Informational - use for real-time performance monitoring |
| `rdi_processor_write_time_ms_last` | Gauge | Last batch write time in milliseconds | Informational - use for real-time performance monitoring |
| `rdi_processor_process_time_ms_last` | Gauge | Last batch process time in milliseconds | Informational - use for real-time performance monitoring |
| `rdi_processor_ack_time_ms_last` | Gauge | Last batch acknowledgment time in milliseconds | Informational - use for real-time performance monitoring |
| `rdi_processor_total_time_ms_last` | Gauge | Last batch total time in milliseconds | Informational - use for real-time performance monitoring |
| `rdi_processor_rec_per_sec_last` | Gauge | Last batch records per second | Informational - use for real-time throughput monitoring |

{{< note >}}
**Additional information about stream processor metrics:**
@@ -121,11 +140,14 @@ RDI reports with their descriptions.
- Metrics with the `_created` suffix are automatically generated by Prometheus for counters and gauges to track when they were first created.
- The `rdi_incoming_entries` metric provides a detailed breakdown for each data source by operation type.
- The `rdi_stream_event_latency_ms` metric helps monitor data freshness and processing delays.
- The processor performance metrics are divided into two categories:
  - **Total metrics**: Accumulate values across all processed batches for historical analysis.
  - **Last batch metrics**: Show real-time performance data for the most recently processed batch.
{{< /note >}}
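
As a rough illustration of how the total metrics relate to per-batch figures, the sketch below divides each cumulative counter by `rdi_processed_batches_total` to get an average per batch. The function name `per_batch_averages`, the `_avg` key naming, and the sample values are hypothetical and not part of RDI. It assumes you have already scraped the counters into a plain dictionary.

```python
def per_batch_averages(totals: dict[str, float]) -> dict[str, float]:
    """Divide each cumulative rdi_processor_*_total counter by the batch count.

    `totals` maps metric names to their current values (hypothetical input
    shape; how you scrape the values is up to your monitoring setup).
    """
    batches = totals.get("rdi_processed_batches_total", 0)
    if batches <= 0:
        # No batches processed yet, so averages are undefined.
        return {}

    prefix, suffix = "rdi_processor_", "_total"
    averages = {}
    for name, value in totals.items():
        # rec_per_sec_total is a gauge, not an accumulating counter, so skip it.
        if name == "rdi_processor_rec_per_sec_total":
            continue
        if name.startswith(prefix) and name.endswith(suffix):
            base = name[len(prefix):-len(suffix)]
            averages[f"{base}_avg"] = value / batches
    return averages


# Made-up sample values for illustration only.
sample = {
    "rdi_processed_batches_total": 4,
    "rdi_processor_batch_size_total": 200,
    "rdi_processor_read_time_ms_total": 42.0,
    "rdi_processor_transform_time_ms_total": 9.2,
    "rdi_processor_write_time_ms_total": 17.6,
}
averages = per_batch_averages(sample)
```

Comparing these averages with the matching `_last` gauges is one way to see whether the most recent batch is slower than the historical norm.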

## Recommended alerting strategy

The alerting strategy described in the sections below focuses on system failures and data integrity issues that require immediate attention. Most other metrics are informational, so you should monitor them for trends rather than trigger alerts.
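
The metrics table suggests alerting on `rdi_incoming_entries` only when the "rejected" count is greater than zero. The following minimal sketch shows one way to check that condition against Prometheus text-format output. It assumes the label names are `data_source` and `operation`, as the metric description implies. The function name `rejected_sources` and the sample data are hypothetical, and the parser is deliberately simplified (it does not handle escaped quotes in label values).

```python
import re

# Simplified matcher for one sample line: name{labels} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def rejected_sources(exposition: str) -> dict[str, float]:
    """Return {data_source: count} for rdi_incoming_entries samples
    whose operation label is "rejected" and whose value is above zero."""
    rejected = {}
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match or match.group("name") != "rdi_incoming_entries":
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        value = float(match.group("value"))
        if labels.get("operation") == "rejected" and value > 0:
            rejected[labels.get("data_source", "unknown")] = value
    return rejected


# Made-up scrape output for illustration only.
sample_scrape = """
# TYPE rdi_incoming_entries gauge
rdi_incoming_entries{data_source="orders",operation="inserted"} 120
rdi_incoming_entries{data_source="orders",operation="rejected"} 3
rdi_incoming_entries{data_source="customers",operation="rejected"} 0
"""
alerts = rejected_sources(sample_scrape)
```

In practice you would more likely express this condition as an alerting rule in your monitoring system. The script only illustrates the logic of the check.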

### Critical alerts (immediate response required)

@@ -4627,6 +4627,22 @@
10.5
]
},
"transform_time_avg": {
"type": "number",
"minimum": 0.0,
"title": "Transform Time Avg",
"examples": [
2.3
]
},
"write_time_avg": {
"type": "number",
"minimum": 0.0,
"title": "Write Time Avg",
"examples": [
4.4
]
},
"process_time_avg": {
"type": "number",
"minimum": 0.0,
@@ -4665,6 +4681,8 @@
"total_batches",
"batch_size_avg",
"read_time_avg",
"transform_time_avg",
"write_time_avg",
"process_time_avg",
"ack_time_avg",
"total_time_avg",