jeastham1993

What does this PR do?

Adds an optional configuration setting that uses span links instead of automatically parenting to the upstream trace context.

Motivation

Parent-child relationships are not always optimal when building asynchronous systems; OpenTelemetry actually recommends using span links in most cases. Adding a configurable option allows developers to pick, for each Lambda function, whether parenting or linking is the right approach.

Testing Guidelines

Added tests. I need advice on the best way to test this with an actual function.

Additional Notes

Types of Changes

  • Bug fix
  • [ X ] New feature
  • Breaking change
  • Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

  • [ X ] This PR's description is comprehensive
  • This PR contains breaking changes that are documented in the description
  • [ X ] This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
  • [ X ] This PR impacts documentation, and it has been updated (or a ticket has been logged)
  • [ X ] This PR's changes are covered by the automated tests
  • This PR collects user input/sensitive content into Datadog
  • This PR passes the integration tests (ask a Datadog member to run the tests)

@jeastham1993 jeastham1993 requested review from a team as code owners August 14, 2025 17:16
@nhulston
Contributor

nhulston commented Aug 14, 2025

This looks really good! I haven't tested that it works yet, but I'm happy to do so.

Currently, this would only create a span link for the first item in a batch, but span links are ideal for a span with multiple 'parents'. Here are the steps I would take to allow for multiple span links:

  1. [extractor.ts] Update EventTraceExtractor interface and TraceContextExtractor.extract() to return arrays instead of a single trace context
  2. [sqs.ts/sns.ts/event-bridge.ts/anything else I missed] Update extract() to return arrays. If span links are enabled, loop through all event.Records instead of just accessing event.Records[0]. We can limit this to a maximum of, for example, 20 records, so we don't waste time and overwhelm the user with too much information. If span links are not enabled, keep accessing event.Records[0] and just return that as a size-1 array.
  3. [trace-context-service.ts and listener.ts] Handle arrays instead of a single value
  4. [listener.ts][onWrap()] Create span links for each item in the array
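A rough sketch of step 2, to make the suggestion concrete. All types and names here (`SpanContextLike`, `RecordLike`, `traceHeaders`, `extractContexts`, `MAX_SPAN_LINKS`) are hypothetical simplifications and not the library's actual interfaces; the cap of 20 is just the example value mentioned above.

```typescript
// Hypothetical, simplified shapes; the real extractor interfaces may differ.
interface SpanContextLike {
  traceId: string;
  spanId: string;
}

interface RecordLike {
  // Assumed field holding already-decoded trace headers for one record.
  traceHeaders?: { traceId: string; spanId: string };
}

interface BatchEventLike {
  Records: RecordLike[];
}

// Assumed cap, taken from the "for example, 20 records" suggestion.
const MAX_SPAN_LINKS = 20;

function extractContexts(event: BatchEventLike, useSpanLinks: boolean): SpanContextLike[] {
  // With span links enabled, inspect up to MAX_SPAN_LINKS records;
  // otherwise only the first record, so the result has at most one item.
  const limit = useSpanLinks ? MAX_SPAN_LINKS : 1;
  const contexts: SpanContextLike[] = [];
  for (const record of event.Records.slice(0, limit)) {
    const headers = record.traceHeaders;
    // Records without trace headers are skipped rather than producing a link.
    if (headers !== undefined) {
      contexts.push({ traceId: headers.traceId, spanId: headers.spanId });
    }
  }
  return contexts;
}
```

Returning an empty array when nothing is found (instead of null) would also let callers check only `length === 0`.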

Happy to help or hop onto a call!

@jeastham1993
Author

@nhulston I've finally got around to adding the additional changes to this PR. I'll fix the merge conflicts, but any chance you could have another look over this now that I've added more robust support for span links?

All the unit tests pass. I'm just deploying stuff to my AWS account to test it works when deployed. I might not get around to doing that before I go on PTO so wanted to leave this for you to have a look at.

}

if (spanContext === null) {
if (spanContexts === null || spanContexts.length === 0) {
Contributor

nit: we can remove the null check here


if (spanContext === null) {
if (spanContexts === null || spanContexts.length === 0) {
Contributor

same here

const extractor = new EventBridgeSQSEventTraceExtractor(tracerWrapper, {useSpanLinks: true} as TraceConfig);

const traceContext = extractor.extract(payload);
expect(traceContext).not.toBeNull();
Contributor

can we just add a simple test to check that each item in the traceContext array matches what we expect? so change the second test record to have a different trace ID.

Then check that traceContext[0].traceId == 7379586022458917877 and traceContext[1].traceId == <new trace id>

const traceContext = extractor.extract(payload);
expect(traceContext).not.toBeNull();

expect(traceContext?.length).toBe(2);
Contributor

same here, we should check expect(traceContext?.[0].toTraceId()).toBe("<trace id 1>"); and expect(traceContext?.[1].toTraceId()).toBe("<trace id 2>");
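One possible way to express the per-item check as a small helper. This is only a sketch: `HasTraceId` and `mismatchedTraceIds` are hypothetical names, and the only thing assumed about the real span context type is the `toTraceId()` method used in the comment above.

```typescript
// Hypothetical minimal shape; the real span context type in the PR may differ.
interface HasTraceId {
  toTraceId(): string;
}

// Returns the indexes where the extracted trace ID differs from the expected
// one (including missing or extra items), so a test can assert the result is empty.
function mismatchedTraceIds(contexts: HasTraceId[], expected: string[]): number[] {
  const mismatches: number[] = [];
  const length = Math.max(contexts.length, expected.length);
  for (let i = 0; i < length; i++) {
    if (contexts[i]?.toTraceId() !== expected[i]) {
      mismatches.push(i);
    }
  }
  return mismatches;
}
```

In a test this could be used as `expect(mismatchedTraceIds(traceContext ?? [], ["<trace id 1>", "<trace id 2>"])).toEqual([]);`, which checks both the length and the order of the extracted contexts in one assertion.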

const extractor = new SNSSQSEventTraceExtractor(tracerWrapper, {useSpanLinks: true} as TraceConfig);

const traceContext = extractor.extract(payload);
expect(traceContext.length).toBe(2);
Contributor

same

tslib "^2.6.2"

"@aws-crypto/[email protected]", "@aws-crypto/sha256-js@^5.2.0":
"@aws-crypto/sha256-js@^5.2.0", "@aws-crypto/[email protected]":
Contributor

@nhulston nhulston Sep 11, 2025

make sure to revert this file rather than commit it, unless you are adding a dependency; it looks like no dependency was added

Contributor

@nhulston nhulston left a comment

Looking great! Left a couple comments. LMK when you resolve those merge conflicts, and then we can run an e2e test to make sure all the trace context is still propagated properly
https://github.com/DataDog/serverless-e2e-tests
