The XMTP SDK sends performance metrics to Datadog to track the duration and delivery reliability of SDK operations. This document describes the metrics, tags, and dashboards that make up our monitoring infrastructure.
Dashboard URL: Workflow Dashboard
This dashboard tracks our three continuous workflows and is connected to the #notify-eng-testing Slack channel for real-time alerts and notifications.
Dashboard URL: SDK Performance Dashboard
The primary metric sent to Datadog is xmtp.sdk.duration, which measures execution time across various SDK operations.
// Send main operation metric
const durationMetricName = `xmtp.sdk.duration`;
metrics.gauge(durationMetricName, value, [
  `libxmtp:${firstPersona.version}`,
  `operation:${operationName}`,
  `test:${testName}`,
  `metric_type:operation`,
  `description:${metricDescription}`,
  `members:${members}`,
]);
Each metric uses the following tags to enable precise filtering and analysis:
Tag | Description |
---|---|
libxmtp | XMTP library version |
operation | Specific operation being measured |
test | Test name |
metric_type | "operation" or "network" |
description | Additional context |
members | Number of members (for group operations) |
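As an illustration of how the tags in the table above map onto the `key:value` strings passed to `metrics.gauge`, here is a minimal sketch. The `DurationTags` interface and the `buildTags` helper are hypothetical names introduced for this example; they are not part of the SDK or the test framework.

```typescript
// Hypothetical shape mirroring the tag table; not an SDK type.
interface DurationTags {
  libxmtp: string;
  operation: string;
  test: string;
  metric_type: "operation" | "network";
  description?: string;
  members?: number;
}

// Turn a tag object into "key:value" strings, skipping unset or empty values.
function buildTags(tags: DurationTags): string[] {
  return Object.entries(tags)
    .filter(([, value]) => value !== undefined && value !== "")
    .map(([key, value]) => `${key}:${value}`);
}
```

Keeping the tag set in one typed object makes it harder to misspell a tag key, which would otherwise silently create a new tag in Datadog.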
In addition to core operation metrics, we track detailed network performance:
// Handle network stats
if (!skipNetworkStats) {
  const networkStats = await getNetworkStats();
  for (const [statName, statValue] of Object.entries(networkStats)) {
    const metricValue = statValue * 1000; // Convert to milliseconds
    metrics.gauge(durationMetricName, metricValue, [
      `libxmtp:${firstPersona.version}`,
      `operation:${operationName}`,
      `test:${testName}`,
      `metric_type:network`,
      `network_phase:${statName.toLowerCase().replace(/\s+/g, "_")}`,
    ]);
  }
}
For implementation details, see the network stats function.
We track five key network phases:
Phase | Description |
---|---|
dns_lookup | DNS resolution time |
tcp_connection | TCP connection establishment time |
tls_handshake | TLS handshake duration |
server_call | Server response time |
processing | Server processing time (calculated as server_call - tls_handshake) |
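The phase names and the derived `processing` value can be sketched as follows. This is an illustration under assumptions: the raw stat names (for example "DNS Lookup") and the seconds-based input are guesses based on the normalization and the millisecond conversion shown earlier, and `toPhaseTag` and `toPhaseMillis` are hypothetical helpers. Whether the real `getNetworkStats` already returns `processing` is not shown in this document.

```typescript
// Hypothetical raw timings in seconds, keyed by human-readable stat names.
type RawNetworkStats = Record<string, number>;

// Normalize a stat name into a network_phase tag value,
// using the same lowercasing and whitespace replacement as the gauge call.
function toPhaseTag(statName: string): string {
  return statName.toLowerCase().replace(/\s+/g, "_");
}

// Convert each phase to milliseconds and derive processing
// as server_call - tls_handshake when both phases are present.
function toPhaseMillis(raw: RawNetworkStats): Record<string, number> {
  const phases: Record<string, number> = {};
  for (const [name, seconds] of Object.entries(raw)) {
    phases[toPhaseTag(name)] = seconds * 1000;
  }
  const serverCall = phases["server_call"];
  const tlsHandshake = phases["tls_handshake"];
  if (serverCall !== undefined && tlsHandshake !== undefined) {
    phases["processing"] = serverCall - tlsHandshake;
  }
  return phases;
}
```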
Dashboard URL: SDK Delivery Dashboard
Beyond performance metrics, we track message delivery reliability through the xmtp.sdk.delivery_rate metric:
// Send delivery rate metric
metrics.gauge("xmtp.sdk.delivery_rate", deliveryRate, [
  `libxmtp:${firstPersona.version}`,
  `test:${testName}`,
  `metric_type:reliability`,
  `members:${members}`,
]);
The delivery dashboard provides critical insights into message reliability:
- Message Delivery Rate (%)
  - 🟢 Green: ≥ 99.9% delivery rate
  - 🟡 Yellow: ≥ 99% delivery rate
  - 🔴 Red: < 99% delivery rate
- Delivery Trends - Historical view of delivery rates to identify patterns or regressions
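The color thresholds above can be expressed as a small classifier. This is a sketch, not the dashboard's actual configuration: `classifyDeliveryRate` and `deliveryRatePercent` are hypothetical helpers, and defining the rate as received messages divided by sent messages is an assumption, since this document does not state how `deliveryRate` is computed.

```typescript
type DeliveryStatus = "green" | "yellow" | "red";

// Map a delivery rate (in percent) to the dashboard color bands.
function classifyDeliveryRate(ratePercent: number): DeliveryStatus {
  if (ratePercent >= 99.9) return "green";
  if (ratePercent >= 99) return "yellow";
  return "red";
}

// Assumed definition: percentage of sent messages that were received.
// Multiplying before dividing avoids an extra floating-point rounding step
// that could push a borderline value just under a threshold.
function deliveryRatePercent(received: number, sent: number): number {
  if (sent <= 0) {
    throw new Error("sent must be greater than zero");
  }
  return (received * 100) / sent;
}
```

For example, `classifyDeliveryRate(deliveryRatePercent(995, 1000))` falls in the yellow band, because 99.5% is at least 99% but below 99.9%.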
The delivery dashboard supports comprehensive filtering by:
Filter | Description | Default |
---|---|---|
Environment (env) | Deployment environment | dev |
Geographic region (geo) | Server location | us |
Test name (test) | Specific test scenario | ts_delivery |
Library version (libxmtp) | XMTP SDK version | - |
Members (members) | Number of participants | - |
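To show how these filters could translate into a Datadog metric query scope, here is a minimal sketch. It assumes the filters correspond to tags of the same name on `xmtp.sdk.delivery_rate` and uses an `avg` aggregator; the dashboard's actual queries are not shown in this document, and `DeliveryFilters` and `buildDeliveryQuery` are hypothetical names.

```typescript
// Hypothetical filter set mirroring the table above.
interface DeliveryFilters {
  env?: string;
  geo?: string;
  test?: string;
  libxmtp?: string;
  members?: number;
}

// Defaults taken from the "Default" column; libxmtp and members have none.
const DEFAULT_FILTERS: DeliveryFilters = {
  env: "dev",
  geo: "us",
  test: "ts_delivery",
};

// Build a query string of the form avg:<metric>{tag:value,...}.
function buildDeliveryQuery(overrides: DeliveryFilters = {}): string {
  const filters = { ...DEFAULT_FILTERS, ...overrides };
  const scope = Object.entries(filters)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `${key}:${value}`)
    .join(",");
  // An empty scope falls back to "*", which matches every tag combination.
  return `avg:xmtp.sdk.delivery_rate{${scope || "*"}}`;
}
```

Calling `buildDeliveryQuery({ members: 10 })` keeps the three defaults and appends a `members:10` filter.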
Preset Views:
- US region performance
- South America region performance