Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Micrometer tracing #451

Open
3 tasks done
adamalexandru4 opened this issue Dec 11, 2023 · 10 comments
Open
3 tasks done

Micrometer tracing #451

adamalexandru4 opened this issue Dec 11, 2023 · 10 comments

Comments

@adamalexandru4
Copy link

  • I am running the latest version
  • I checked the documentation and found no answer
  • I checked to make sure that this issue has not already been filed

Expected Behavior

Micrometer tracing available on each task execution.

Would be great if the ExecutePicked would be wrapped in a tracing context: One traceId, and different spans for complete or failure or even the same spanId.

Current Behavior

No tracing available

@adamalexandru4
Copy link
Author

@kagkarlsson If you agree with this idea, I can contribute the required changes.

@kagkarlsson
Copy link
Owner

I think that makes sense yes, as long as the code is acceptable. Not sure if we have all we need for it, I was thinking it could be implemented with what is now called StatsRegistry (might be promoted to some sort of Listener interface)

@adamalexandru4
Copy link
Author

The starting idea would be to play around with Tracer interface from micrometer-tracing inside the methods of ExecutePicked class. By default the Scheduler, would have a Tracer.NOOP implementation but in Spring Boot module we could try to detect the bean from ApplicationContext and inject it.

private void executePickedExecution(Execution execution, CurrentlyExecuting currentlyExecuting) { 
    .....
    Span taskExecutionSpan = tracer.nextSpan().name("task-execution")
      .tag("task-name", execution.taskInstance.getTaskName());
    try (Tracer.SpanInScope spanInScope = tracer.withSpan(taskExecutionSpan)) {
      ....
    } catch (..) {
      .....
   }  finally {
      taskExecutionSpan.end();
    }
 }

private void complete(CompletionHandler completion, Execution execution, Instant executionStarted) {
    ....
    Span taskCompleteSpan = tracer.nextSpan().name("task-complete")
      .tag("task-name", execution.taskInstance.getTaskName());
    try (Tracer.SpanInScope spanInScope = tracer.withSpan(taskCompleteSpan)){
        .... 
    } catch (..) {
      .....
   }  finally {
      taskCompleteSpan.end();
    }


private void failure(
      Task task,
      Execution execution,
      Throwable cause,
      Instant executionStarted,
      String errorMessagePrefix) {
    ....
    Span taskFailureSpan = tracer.nextSpan().name("task-failure")
      .tag("task-name", execution.taskInstance.getTaskName());
    try (Tracer.SpanInScope spanInScope = tracer.withSpan(taskCompleteSpan)){
        .... 
    } catch (..) {
      .....
   }  finally {
      taskFailureSpan.end();
    }

@NicklasWallgren
Copy link
Contributor

NicklasWallgren commented May 31, 2024

We would also benefit from this, maybe OpenTelemetry could be a viable option instead of Micrometer Tracing?

I don't think it's possible to implement tracing using the StatsRegistry in its current form. It would require additional events in ExecutionStatsEvent (i.e. STARTED), and passing Execution to StatsRegistry#register so that we can add span attributes such as taskName and id.

@kagkarlsson
Copy link
Owner

A loose plan is to deprecate StatsRegistry and replace it with one or many Listener interfaces, also covering started and completed events. If anyone would like to take a stab at that (adding some form of adapter for StatsRegistry), feel free to try

@NicklasWallgren
Copy link
Contributor

NicklasWallgren commented May 31, 2024

I've created a basic proof of concept that divides ExecutePicked into two versions: DefaultExecutePicked and ObservableExecutePicked. You can review the changes in the commit linked below:
NicklasWallgren@5e2d23d

However, a more efficient approach might be to transform StatsRegistry into a comprehensive listener interface, which would prevent multiple implementations of ExecutePicked

@kagkarlsson
Copy link
Owner

How does #504 work for this?
Specifically:

  void onExecutionStart(CurrentlyExecuting currentlyExecuting);
  void onExecutionComplete(ExecutionComplete executionComplete);

@kagkarlsson
Copy link
Owner

@NicklasWallgren Any chance you could try reimplementing your PoC using the new SchedulerListener?

@NicklasWallgren
Copy link
Contributor

NicklasWallgren commented Jul 3, 2024

@NicklasWallgren Any chance you could try reimplementing your PoC using the new SchedulerListener?

Great! It should possible to implement tracing with help of the new SchedulerListener.

I've implemented a temporary ExecutionHandler-wrapper, which bootstraps the trace/transaction before kicking of the delegated/actual execution handler.

public class ElasticApmExecutionHandler<T extends ObservableTaskContext> implements ExecutionHandler<T> {
    private static final String TRANSACTION_TYPE = "queue.task";

    private final ExecutionHandler<T> executionHandler;

    private ElasticApmExecutionHandler(final ExecutionHandler<T> executionHandler) {
        this.executionHandler = executionHandler;
    }

    @Override
    public CompletionHandler<T> execute(final TaskInstance<T> taskInstance, final ExecutionContext executionContext) {
        final Execution execution = executionContext.getExecution();

        final T data = taskInstance.getData();

        final Transaction transaction = ElasticApm
            .startTransactionWithRemoteParent((it) -> TraceIdHeaderValueFactory.create(data.traceId(), data.spanId()));

        try (final Scope scope = transaction.activate()) {
            transaction.setName("QueueTask#" + execution.getTaskName());
            transaction.setType(TRANSACTION_TYPE);

            transaction.setLabel("queue.task.id", execution.getId());
            transaction.setLabel("queue.task.name", execution.getTaskName());

            return executionHandler.execute(taskInstance, executionContext);
        } catch (final Exception e) {
            transaction.captureException(e);

            throw e;
        } finally {
            transaction.end();
        }
    }

    public static <T extends ObservableTaskContext> ElasticApmExecutionHandler<T> wrap(final ExecutionHandler<T> executionHandler) {
        return new ElasticApmExecutionHandler<>(executionHandler);
    }

    private static class TraceIdHeaderValueFactory {
        /**
         * Builds a trace parent which follows the W3C TraceParent standard.
         * <p>
         * Format: trace-id "-" span-id "-" trace-flags
         *
         * @return the trace parent
         */
        public static String create(final String traceId, final String spanId) {
            return String.format("00-%s-%s-01", traceId, spanId);
        }
    }
}

@kagkarlsson
Copy link
Owner

After some consideration I added an interceptor-concept as well. See PR #504

kagkarlsson added a commit that referenced this issue Jul 23, 2024
Replaces the old `StatsRegistry` and adds more detailed events:

```java
  void onExecutionScheduled(TaskInstanceId taskInstanceId, Instant executionTime);
  void onExecutionStart(CurrentlyExecuting currentlyExecuting);
  void onExecutionComplete(ExecutionComplete executionComplete);
  void onExecutionDead(Execution execution);
  void onExecutionFailedHeartbeat(CurrentlyExecuting currentlyExecuting);
```

Additionally adds `ExecutionInterceptor` as a way of injecting
wrapping-logic for all executions.

```java
  CompletionHandler<?> execute(
      TaskInstance<?> taskInstance, ExecutionContext executionContext, ExecutionChain chain);
```

## Fixes
* #451
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants