Skip to content

[augment/human] Add opentelemetry#45

Merged
p14n merged 2 commits intomainfrom
augment-opentelemetry
Apr 6, 2025
Merged

[augment/human] Add opentelemetry#45
p14n merged 2 commits intomainfrom
augment-opentelemetry

Conversation

@p14n
Copy link
Copy Markdown
Owner

@p14n p14n commented Apr 4, 2025

Summary by Sourcery

Add OpenTelemetry instrumentation to the message broker system to enable observability and tracing

New Features:

  • Introduced OpenTelemetry integration for tracing and metrics in the message broker system
  • Added telemetry configuration and metrics tracking for message publishing and subscribing

Enhancements:

  • Refactored message broker constructors to include telemetry configuration
  • Added event ID tracking and span attributes for better observability

Build:

  • Added OpenTelemetry dependencies to the project build configuration

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai bot commented Apr 4, 2025

Reviewer's Guide by Sourcery

This pull request introduces OpenTelemetry instrumentation to the post-event system. It adds metrics for message brokers and tracing spans to track message publishing, receiving, and subscriber activity. The changes include adding new dependencies, creating a telemetry configuration, updating broker implementations, and modifying client and server components to use OpenTelemetry.

Sequence diagram for publishing a message with tracing

sequenceDiagram
    participant Broker as DefaultMessageBroker
    participant Tracer
    participant Subscriber

    Broker->>Tracer: startSpan("publish_message")
    activate Tracer
    Tracer-->>Broker: Span
deactivate Tracer
    Broker->>Subscriber: async submit
    activate Subscriber
    Subscriber->>Tracer: startSpan("process_message")
    activate Tracer
    Tracer-->>Subscriber: ChildSpan
deactivate Tracer
    Subscriber->>Subscriber: onMessage(convert(message))
    Subscriber->>Tracer: end()
    activate Tracer
    deactivate Tracer
    Subscriber-->>Broker: return
    deactivate Subscriber
    Broker->>Tracer: end()
    activate Tracer
    deactivate Tracer
Loading

Updated class diagram for DefaultMessageBroker

classDiagram
    class DefaultMessageBroker {
        -ConcurrentHashMap~String, Set~MessageSubscriber~OutT~~~ topicSubscribers
        -AtomicBoolean closed
        -AsyncExecutor asyncExecutor
        -BrokerMetrics metrics
        -Tracer tracer
        +DefaultMessageBroker(TelemetryConfig telemetryConfig)
        +DefaultMessageBroker(AsyncExecutor asyncExecutor, TelemetryConfig telemetryConfig)
        #canProcess(String topic, InT message) bool
        +publish(String topic, InT message)
        +subscribe(String topic, MessageSubscriber~OutT~ subscriber) bool
        +unsubscribe(String topic, MessageSubscriber~OutT~ subscriber) bool
        +close()
        #getEventId(InT message) String
    }
    class BrokerMetrics {
      -LongCounter publishedMessages
      -LongCounter receivedMessages
      -LongUpDownCounter activeSubscribers
      +BrokerMetrics(Meter meter)
      +recordPublished(String topic)
      +recordReceived(String topic)
      +recordSubscriberAdded(String topic)
      +recordSubscriberRemoved(String topic)
    }
    class TelemetryConfig{
      <<interface>>
      +getMeter() Meter
      +getTracer() Tracer
      +getOpenTelemetry() OpenTelemetry
    }
    DefaultMessageBroker -- BrokerMetrics : has
    DefaultMessageBroker -- TelemetryConfig : uses
Loading

Updated class diagram for TransactionalBroker

classDiagram
    class TransactionalBroker {
        -DataSource ds
        +TransactionalBroker(DataSource ds, AsyncExecutor asyncExecutor, TelemetryConfig telemetryConfig)
        +TransactionalBroker(DataSource ds, TelemetryConfig telemetryConfig)
        +publish(String topic, Event message)
        +convert(Event m) TransactionalEvent
        +getEventId(Event message) String
    }
    TransactionalBroker --|> DefaultMessageBroker
Loading

Updated class diagram for EventMessageBroker

classDiagram
    class EventMessageBroker {
        +EventMessageBroker(AsyncExecutor asyncExecutor, TelemetryConfig telemetryConfig)
        +EventMessageBroker(TelemetryConfig telemetryConfig)
        +convert(Event m) Event
        +getEventId(Event message) String
    }
    EventMessageBroker --|> DefaultMessageBroker
Loading

Updated class diagram for SystemEventBroker

classDiagram
    class SystemEventBroker {
        +SystemEventBroker(AsyncExecutor asyncExecutor, TelemetryConfig telemetryConfig)
        +SystemEventBroker(TelemetryConfig telemetryConfig)
        +convert(SystemEvent m) SystemEvent
        +subscribe(MessageSubscriber~SystemEvent~ subscriber)
        +getEventId(SystemEvent message) String
    }
    SystemEventBroker --|> DefaultMessageBroker
Loading

Updated class diagram for ConsumerClient

classDiagram
    class ConsumerClient {
        -List~AutoCloseable~ closeables
        -TransactionalBroker tb
        -SystemEventBroker seb
        -TelemetryConfig telemetryConfig
        +ConsumerClient(TelemetryConfig telemetryConfig, AsyncExecutor asyncExecutor)
        +ConsumerClient(TelemetryConfig telemetryConfig)
        +start(Set~String~ topics, DataSource ds, String host, int port)
        +start(Set~String~ topics, DataSource ds, ManagedChannel channel)
        +close()
    }
Loading

Updated class diagram for LocalPersistentConsumer

classDiagram
    class LocalPersistentConsumer {
        -DataSource ds
        -PostEventConfig cfg
        -AsyncExecutor asyncExecutor
        -TransactionalBroker tb
        -List~AutoCloseable~ closeables
        -TelemetryConfig telemetryConfig
        +LocalPersistentConsumer(DataSource ds, PostEventConfig cfg, AsyncExecutor asyncExecutor, TelemetryConfig telemetryConfig)
        +LocalPersistentConsumer(DataSource ds, PostEventConfig cfg, TelemetryConfig telemetryConfig)
        +start() void
        +close() void
    }
Loading

Updated class diagram for ConsumerServer

classDiagram
    class ConsumerServer {
        -List~AutoCloseable~ closeables
        -Server server
        -AsyncExecutor asyncExecutor
        -TelemetryConfig telemetryConfig
        +ConsumerServer(DataSource ds, ConfigData cfg, TelemetryConfig telemetryConfig)
        +ConsumerServer(DataSource ds, ConfigData cfg, AsyncExecutor asyncExecutor, TelemetryConfig telemetryConfig)
        +start(int port) void
        +start(ServerBuilder~?~ sb) void
        +close() void
    }
Loading

Updated class diagram for MessageBrokerGrpcClient

classDiagram
    class MessageBrokerGrpcClient {
        -ManagedChannel channel
        +MessageBrokerGrpcClient(TelemetryConfig telemetryConfig, String host, int port)
        +MessageBrokerGrpcClient(TelemetryConfig telemetryConfig, ManagedChannel channel)
    }
Loading

Class diagram for TelemetryConfig and DefaultTelemetryConfig

classDiagram
    class DefaultTelemetryConfig {
        -String INSTRUMENTATION_NAME
        -OpenTelemetry openTelemetry
        -Meter meter
        -Tracer tracer
        +DefaultTelemetryConfig(String serviceName)
        +getMeter() Meter
        +getTracer() Tracer
        +getOpenTelemetry() OpenTelemetry
    }
    class TelemetryConfig{
      <<interface>>
      +getMeter() Meter
      +getTracer() Tracer
      +getOpenTelemetry() OpenTelemetry
    }
    TelemetryConfig <|-- DefaultTelemetryConfig
Loading

File-Level Changes

Change Details Files
Introduces OpenTelemetry instrumentation for message brokers to track message publishing, receiving, and subscriber activity.
  • Adds opentelemetry-api, opentelemetry-sdk, and opentelemetry-exporter-otlp dependencies to the build.gradle file.
  • Adds opentelemetry-grpc-1.6 for gRPC instrumentation.
  • Adds DefaultTelemetryConfig to configure OpenTelemetry SDK with a service name.
  • Adds BrokerMetrics to record metrics for published messages, received messages, and active subscribers.
  • Adds tracing spans to DefaultMessageBroker and TransactionalBroker for message publishing and processing.
  • Records metrics for message publishing, receiving, subscriber addition, and subscriber removal in DefaultMessageBroker and TransactionalBroker.
  • Passes TelemetryConfig to broker constructors.
  • Adds a getEventId abstract method to DefaultMessageBroker to allow subclasses to define how to extract an event ID from a message for tracing purposes.
  • Implements getEventId in TransactionalBroker, EventMessageBroker, and SystemEventBroker.
build.gradle
src/main/java/com/p14n/postevent/broker/DefaultMessageBroker.java
src/main/java/com/p14n/postevent/broker/TransactionalBroker.java
src/main/java/com/p14n/postevent/telemetry/DefaultTelemetryConfig.java
src/main/java/com/p14n/postevent/telemetry/BrokerMetrics.java
src/main/java/com/p14n/postevent/telemetry/TelemetryConfig.java
src/main/java/com/p14n/postevent/broker/EventMessageBroker.java
src/main/java/com/p14n/postevent/broker/SystemEventBroker.java
Updates client and server components to use OpenTelemetry for tracing and metrics.
  • Adds TelemetryConfig to ConsumerClient and ConsumerServer constructors.
  • Passes TelemetryConfig to broker constructors in ConsumerClient and ConsumerServer.
  • Adds TelemetryConfig to MessageBrokerGrpcClient constructor.
  • Adds TelemetryConfig to LocalPersistentConsumer constructor.
  • Passes TelemetryConfig to broker constructors in LocalPersistentConsumer.
  • Updates tests to use DefaultTelemetryConfig.
src/main/java/com/p14n/postevent/ConsumerClient.java
src/main/java/com/p14n/postevent/ConsumerServer.java
src/main/java/com/p14n/postevent/broker/grpc/MessageBrokerGrpcClient.java
src/main/java/com/p14n/postevent/LocalPersistentConsumer.java
src/test/java/com/p14n/postevent/DeterministicConsumerTest.java
src/test/java/com/p14n/postevent/CatchupServiceTest.java
src/test/java/com/p14n/postevent/DefaultMessageBrokerTest.java
src/test/java/com/p14n/postevent/MessageBrokerGrpcIntegrationTest.java
src/test/java/com/p14n/postevent/example/RemoteConsumerExample.java
src/test/java/com/p14n/postevent/PersistentBrokerTest.java
src/test/java/com/p14n/postevent/example/LocalPersistentConsumerExample.java
src/test/java/com/p14n/postevent/LocalConsumerTest.java
src/test/java/com/p14n/postevent/example/LocalConsumerExample.java
Adds logging to PersistentBroker when publishing a catchup required event.
  • Adds a debug log message to PersistentBroker when publishing a catchup required event.
src/main/java/com/p14n/postevent/catchup/PersistentBroker.java

Possibly linked issues

  • #0: The PR implements the issue by adding OpenTelemetry instrumentation to the message broker.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @p14n - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider using a dedicated thread pool for telemetry to avoid impacting the performance of the main application logic.
  • The added metrics look good, but consider adding more context-specific attributes to the spans for better observability.
Here's what I looked at during the review
  • 🟡 General issues: 2 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 2 issues found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.


metrics.recordPublished(topic);

Span span = tracer.spanBuilder("publish_message")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Parent span is not ended.

The span created for publishing messages is started but never ended. This may lead to hanging spans in your telemetry. Consider ensuring span.end() is called after the subscriber tasks are submitted, for example by wrapping the publish logic in a try/finally block.


metrics.recordPublished(topic);

Span span = tracer.spanBuilder("publish_message_tx")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Transactional parent span is never ended.

Similar to the non-transactional broker, the parent span in the transactional publish method is started but not ended. Adding an appropriate span.end() call will ensure proper trace completion.

Set<MessageSubscriber<OutT>> subscribers = topicSubscribers.get(topic);
if (subscribers != null) {
for (MessageSubscriber<OutT> subscriber : subscribers) {
asyncExecutor.submit(() -> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider extracting the telemetry boilerplate from the lambda into a helper method to reduce nesting and complexity in the publish method, improving readability and maintainability without altering functionality..

Consider extracting the telemetry boilerplate (span creation, scoping, and error handling) out of the lambda to a helper. For example, extract the processing of a subscriber’s message into a private method:

```java
private boolean processMessageWithTelemetry(MessageSubscriber<OutT> subscriber, String topic, InT message) {
    Span childSpan = tracer.spanBuilder("process_message")
            .setAttribute("topic", topic)
            .setAttribute("event.id", getEventId(message))
            .startSpan();
    try (Scope childScope = childSpan.makeCurrent()) {
        subscriber.onMessage(convert(message));
        metrics.recordReceived(topic);
        return true;
    } catch (Exception e) {
        childSpan.recordException(e);
        try {
            subscriber.onError(e);
        } catch (Exception ignored) {
            // ignore error handling failure
        }
        return false;
    } finally {
        childSpan.end();
    }
}

Then in publish(), replace the inner lambda code with a call to this helper:

@Override
public void publish(String topic, InT message) {
    if (!canProcess(topic, message)) {
        return;
    }
    metrics.recordPublished(topic);
    Span span = tracer.spanBuilder("publish_message")
            .setAttribute("topic", topic)
            .setAttribute("event.id", getEventId(message))
            .startSpan();
    try {
        Set<MessageSubscriber<OutT>> subscribers = topicSubscribers.get(topic);
        if (subscribers != null) {
            for (MessageSubscriber<OutT> subscriber : subscribers) {
                asyncExecutor.submit(() -> processMessageWithTelemetry(subscriber, topic, message));
            }
        }
    } finally {
        span.end();
    }
}

This refactoring extracts the telemetry logic out of the lambda, reducing the nesting and overall complexity, while preserving all functionality.

// Deliver to all subscribers
for (MessageSubscriber<TransactionalEvent> subscriber : topicSubscribers.get(topic)) {
try (Connection c = ds.getConnection()) {
var op = new OrderedProcessor((connection, event) -> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider extracting the telemetry logic into helper methods to reduce nested try-catch-finally blocks and improve code clarity by centralizing span management and exception handling, for example, by using a dedicated method for child span handling and refactoring the lambda expression in the publish method to use this method, which reduces duplication and nesting depth.

Extracting the telemetry logic into helper methods can reduce the nested try-catch-finally blocks and clarify the core message processing. For example, you could extract the child span handling into a dedicated method:

private boolean processWithTelemetry(String spanName, String topic, String eventId, Supplier<Boolean> action) {
    Span span = tracer.spanBuilder(spanName)
            .setAttribute("topic", topic)
            .setAttribute("event.id", eventId)
            .startSpan();
    try (Scope scope = span.makeCurrent()) {
        return action.get();
    } catch (Exception e) {
        span.recordException(e);
        throw e;
    } finally {
        span.end();
    }
}

Then, in your publish method, you could refactor the lambda as follows:

var op = new OrderedProcessor((connection, event) -> 
    processWithTelemetry("process_message", topic, getEventId(message), () -> {
        subscriber.onMessage(new TransactionalEvent(connection, event));
        metrics.recordReceived(topic);
        return true;
    })
);

This approach centralizes the telemetry span creation and ending logic, reducing duplication and the depth of nested exception handling.

@p14n p14n merged commit 8d76434 into main Apr 6, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant