feat(instrumentation-aws-sdk): add bedrock extension to apply gen ai conventions #2700

anuraaga · 2025-02-06T04:10:55Z

Which problem is this PR solving?

AWS SDK spans for bedrock should be gen AI spans but are currently generic SDK spans

Short description of the changes

Add a service extension for Bedrock, applying gen AI conventions
- This initial PR sets up the general infrastructure, so it applies the conventions minimally: only Converse, with span attributes. Future PRs will add other Bedrock APIs and other gen AI conventions such as events and metrics
Update AWS SDK deps to match the newer required SDK version
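
As a rough illustration of what applying gen AI conventions to a Converse call involves, the sketch below maps a Converse-style request onto span attributes. It is not the code from this PR: `ConverseInputLike`, `converseRequestAttributes`, and `converseSpanName` are hypothetical names, and the attribute keys are written as plain strings on the assumption that they match the gen AI semantic conventions referenced in this thread (`gen_ai.system` set to `aws.bedrock`, operation name `chat`).

```typescript
// Hypothetical, dependency-free stand-in for the fields of a Converse request
// that this sketch reads. The real SDK input type has more fields.
interface ConverseInputLike {
  modelId?: string;
  inferenceConfig?: {
    maxTokens?: number;
    temperature?: number;
    topP?: number;
    stopSequences?: string[];
  };
}

type AttributeValue = string | number | string[];

// Build span attributes for a Converse request. Only values that are actually
// present on the request are recorded.
function converseRequestAttributes(
  input: ConverseInputLike
): Record<string, AttributeValue> {
  const attrs: Record<string, AttributeValue> = {
    'gen_ai.system': 'aws.bedrock',
    'gen_ai.operation.name': 'chat',
  };
  if (input.modelId !== undefined) {
    attrs['gen_ai.request.model'] = input.modelId;
  }
  const cfg = input.inferenceConfig;
  if (cfg) {
    const { maxTokens, temperature, topP, stopSequences } = cfg;
    if (maxTokens !== undefined) attrs['gen_ai.request.max_tokens'] = maxTokens;
    if (temperature !== undefined) attrs['gen_ai.request.temperature'] = temperature;
    if (topP !== undefined) attrs['gen_ai.request.top_p'] = topP;
    if (stopSequences !== undefined) attrs['gen_ai.request.stop_sequences'] = stopSequences;
  }
  return attrs;
}

// Span name in the "{operation} {model}" shape; falls back to the operation
// name alone when no model id is available.
function converseSpanName(input: ConverseInputLike): string {
  return input.modelId ? `chat ${input.modelId}` : 'chat';
}
```

In a real service extension these values would be attached to the span when the request is sent; the sketch only shows the mapping itself.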

anuraaga · 2025-02-06T04:16:55Z

Hi @AmanAgarwal041 - just wanted to share this PR adding gen ai instrumentation of bedrock. Notably, for tests it takes a pattern of recording real LLM responses using nock back, which I see most gen ai instrumentation outside of opentelemetry taking due to the complexity of the models. It may be a good approach for #2402 (if you need help with that PR, let me know). Cheers.

codecov · 2025-02-06T04:22:37Z

Codecov Report

Attention: Patch coverage is 96.55172% with 2 lines in your changes missing coverage. Please review.

Project coverage is 90.55%. Comparing base (7f48564) to head (c220322).
Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...entelemetry-instrumentation-aws-sdk/src/aws-sdk.ts | 0.00% | 1 Missing ⚠️ |
| ...umentation-aws-sdk/src/services/bedrock-runtime.ts | 97.67% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2700      +/-   ##
==========================================
+ Coverage   90.51%   90.55%   +0.04%     
==========================================
  Files         166      168       +2     
  Lines        8020     8078      +58     
  Branches     1533     1548      +15     
==========================================
+ Hits         7259     7315      +56     
- Misses        761      763       +2

| Files with missing lines | Coverage Δ | |
|---|---|---|
| ...entelemetry-instrumentation-aws-sdk/src/semconv.ts | `100.00% <100.00%> (ø)` | |
| ...ntation-aws-sdk/src/services/ServicesExtensions.ts | `96.55% <100.00%> (+0.25%)` | ⬆️ |
| ...entelemetry-instrumentation-aws-sdk/src/aws-sdk.ts | `92.54% <0.00%> (-0.58%)` | ⬇️ |
| ...umentation-aws-sdk/src/services/bedrock-runtime.ts | `97.67% <97.67%> (ø)` | |

anuraaga · 2025-02-06T04:25:37Z

@trentm Oh, I remember now that the latest version of the AWS SDK doesn't support Node 14, which is causing the unit tests to fail after anuraaga#1 (comment). Let me know what a good way forward would be. I think it's either dropping Node 14 support for this package (not sure that's allowed) or downgrading again.

codefromthecrypt

Surprisingly concise change to add the toehold here. I know there's more to do, but good work!

jj22ee · 2025-02-07T20:03:54Z

Hi @anuraaga, thanks for this. Just to let you know, the ADOT team from AWS also has a vested interest in Gen AI support in the AWS SDK instrumentation, and is planning to contribute in the future as well.

There are Bedrock Service Extension implementations in multiple languages in ADOT's downstream versions of the auto-instrumentations (JS, Python), which are similar but might not be fully 1:1 with your changes. Could you also consider ADOT's implementations in this PR? I'll also spend time comparing the two implementations for any subtle differences.

codefromthecrypt · 2025-02-07T22:07:11Z

@jj22ee drive-by comment, but one thing to give heads up about is that the toehold here is based on the latest genai semantic conventions, and I don't expect this to want to do too much in one PR. So, one thing to think about is what happens in this PR vs a follow-up to add more features or things that are defined beyond the otel specs. Doing it like that might result in the same or similar end, but faster vs trying to do too much in the first of many PRs. food for thought!

https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-spans.md

jj22ee · 2025-02-07T22:21:46Z

Thanks for the heads up @codefromthecrypt. I agree to start smaller. Just wanted to ensure there are no conflicts, but the latest Gen AI conventions will take priority, so I have no issue with that.

jj22ee · 2025-02-08T00:07:48Z

plugins/node/opentelemetry-instrumentation-aws-sdk/src/services/bedrock-runtime.ts

+    const { inputTokens, outputTokens } = usage;
+    if (inputTokens !== undefined) {
+      span.setAttribute(ATTR_GEN_AI_USAGE_INPUT_TOKENS, inputTokens);
+    }
+    if (outputTokens !== undefined) {
+      span.setAttribute(ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, outputTokens);
+    }


In case usage is undefined:

Suggested change

-    const { inputTokens, outputTokens } = usage;
-    if (inputTokens !== undefined) {
-      span.setAttribute(ATTR_GEN_AI_USAGE_INPUT_TOKENS, inputTokens);
-    }
-    if (outputTokens !== undefined) {
-      span.setAttribute(ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, outputTokens);
-    }
+    if (usage) {
+      const { inputTokens, outputTokens } = usage;
+      if (inputTokens !== undefined) {
+        span.setAttribute(ATTR_GEN_AI_USAGE_INPUT_TOKENS, inputTokens);
+      }
+      if (outputTokens !== undefined) {
+        span.setAttribute(ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, outputTokens);
+      }
+    }
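
The guard suggested in this review comment can be tried out in isolation with the sketch below. It is a standalone approximation, not the code from the PR: `SpanLike`, `TokenUsage`, `recordTokenUsage`, and `RecordingSpan` are hypothetical stand-ins, and the two attribute keys are redefined locally on the assumption that they carry the semantic-convention names `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens`.

```typescript
// Local copies of the attribute keys so the sketch has no dependencies.
const ATTR_GEN_AI_USAGE_INPUT_TOKENS = 'gen_ai.usage.input_tokens';
const ATTR_GEN_AI_USAGE_OUTPUT_TOKENS = 'gen_ai.usage.output_tokens';

// Hypothetical minimal span surface: only the method this sketch calls.
interface SpanLike {
  setAttribute(key: string, value: number): void;
}

// Hypothetical shape of the usage block of a response; every part may be absent.
interface TokenUsage {
  inputTokens?: number;
  outputTokens?: number;
}

// Record token counts only when the usage block exists. Comparing against
// undefined (rather than truthiness) keeps a legitimate count of 0.
function recordTokenUsage(span: SpanLike, usage: TokenUsage | undefined): void {
  if (!usage) {
    return;
  }
  const { inputTokens, outputTokens } = usage;
  if (inputTokens !== undefined) {
    span.setAttribute(ATTR_GEN_AI_USAGE_INPUT_TOKENS, inputTokens);
  }
  if (outputTokens !== undefined) {
    span.setAttribute(ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, outputTokens);
  }
}

// In-memory span used only to observe what was recorded.
class RecordingSpan implements SpanLike {
  readonly attributes: Record<string, number> = {};
  setAttribute(key: string, value: number): void {
    this.attributes[key] = value;
  }
}
```

Without the outer guard, destructuring an undefined `usage` would throw a `TypeError` inside the response hook, which is the failure the suggestion avoids.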

jj22ee · 2025-02-08T00:19:15Z

plugins/node/opentelemetry-instrumentation-aws-sdk/src/semconv.ts

+ * limitations under the License.
+ */
+
+// Gen AI conventions


Can you also copy over the doc comments for the copied ATTR_* attribute keys from semantic-conventions?

Example from a resource-detector package

trentm · 2025-02-12

I have a commit that regenerates src/semconv.ts using an upcoming script for this (from #2669), if you are okay with me pushing to your branch, @anuraaga.
It also updates the semconv dep to ^1.29.0 because you are using semconv attributes defined in that version (GEN_AI_SYSTEM_VALUE_AWS_BEDROCK), though technically, because a copy (in src/semconv.ts) is being used, there isn't really a runtime dep.

anuraaga · 2025-02-12

Thanks @trentm - feel free to push anything you need to this branch.

karthikscale3 · 2025-02-09T05:23:32Z

nice work! looks good for the most part.

anuraaga · 2025-02-10T01:46:10Z

Thanks @jj22ee - I think the implementation pattern itself is similar or the same, as almost all the work is done by the service extension anyway. Technically there isn't any overlap yet since, AFAIU, ADOT currently only instruments InvokeModel while I started with Converse here, though when adding InvokeModel support here I think it will be very similar to ADOT. Initially this wasn't obvious, so I reworked the implementation to make it clearer that this PR is Converse-only, while hopefully making the extension to InvokeModel obvious. Skimming through the attributes, I didn't see any conflicts in semconv yet either (it's still gen_ai.system, for now...)

The bigger difference is the test infrastructure using nock-back, which we also adopted in upstream python

https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/instrumentation/opentelemetry-instrumentation-botocore/tests/conftest.py#L71

so I think if we get this in for the initial infrastructure we'll be able to smoothly add more features over subsequent PRs by only focusing on business logic instead of build infra.

jj22ee · 2025-02-11T02:52:36Z

Thank you, the changes lgtm!

As for the suddenly failing unit tests, the latest commit shows 15 tests failing, but I'm pretty sure 14 of them are unrelated to your changes. I suppose there is an issue with the recently added "does not currently add genai conventions" test which is causing the other 14 to fail. Likely InvokeModel isn't intercepted by nock-back?

I recall having a similar issue when using nock for the Kinesis client: I needed to force the NodeHttpHandler requestHandler because the default handler used HTTP/2, which nock did not support. Not sure if this issue is the same.

anuraaga · 2025-02-11T04:57:55Z

Sorry @jj22ee, I missed the test failures behind the known Node 14 ones that I need advice on. I just forgot to git add the recording; I will do it soon.

trentm · 2025-02-14T04:58:46Z

@anuraaga I have an idea for dealing with the node v14 test breakage. It may end up looking a little gross. I still have to write it up, and I may break it into a separate PR: a PR that (a) updates the existing @aws-sdk/client-* deps to recent versions; (b) changes the npm test for this package to just skip running with Node.js v14 and v16; and (c) uses the "test-all-versions" tests (i.e. .tav.yml) to handle running tests with node 14 and 16.
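
Step (b) of this plan, skipping the regular test run on old Node.js versions, could look roughly like the sketch below. This is a hypothetical helper written for illustration, not the script that was later added to the repo: `parseMajor` and `shouldSkipTests` are assumed names, and the version string is passed in as a parameter (for example from `process.version`) so the sketch stays dependency-free.

```typescript
// Extract the major version from strings such as "v14.21.3" or "18.19.0".
function parseMajor(version: string): number {
  const match = /^v?(\d+)/.exec(version);
  if (!match) {
    throw new Error(`Unrecognized Node.js version: ${version}`);
  }
  return Number(match[1]);
}

// Decide whether the default test run should be skipped because the current
// runtime is older than the minimum the installed dependencies support.
function shouldSkipTests(nodeVersion: string, minMajor: number): boolean {
  return parseMajor(nodeVersion) < minMajor;
}
```

A wrapper around the test command could call `shouldSkipTests(process.version, 18)` and exit successfully without running anything when it returns true, leaving coverage of older Node.js versions to the test-all-versions runs described in step (c).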

trentm · 2025-02-19T01:09:30Z

Following up on "I have an idea": see #2722 and #2723 for this.

Assuming these get in, I can resync this PR to main and add my semconv updates that I mentioned above at #2700 (comment)

anuraaga · 2025-03-04T10:46:57Z

I noticed open-telemetry/opentelemetry-js#5521 bumping min node.

Does that mean we can just wait for this repo to upgrade to the new SDK, drop the workflows for old CI, and go from there?

anuraaga · 2025-03-19T02:08:25Z

@trentm I have merged main, added bedrock to tav, removed the node 14 and 16 blocks from existing TAV since I guess they're not actually used anymore, and build is green. Hopefully we're ready to go!

jj22ee

Adding a note here, in the final followup PR, the README can be updated here and here for all the new bedrock service extensions and attributes. I also haven't updated it yet with the s3/kinesis extensions...

jj22ee · 2025-03-19T02:47:33Z

plugins/node/opentelemetry-instrumentation-aws-sdk/.tav.yml

@@ -9,30 +9,27 @@
 # - 14.x dropped in v3.567.0 https://github.com/aws/aws-sdk-js-v3/pull/6034
 # - 16.x dropped in v3.723.0 https://github.com/aws/aws-sdk-js-v3/pull/6775

-"@aws-sdk/client-s3":
+"@aws-sdk/client-bedrock-runtime":
  env:
    - SKIP_TEST_IF_DISABLE=true
  # - 3.529.0 was missing the fast-xml-parser dependency (https://github.com/aws/aws-sdk-js-v3/releases/tag/v3.529.1)


Can remove this comment since the bedrock-runtime tests are only for ^3.587.0

anuraaga added 6 commits January 30, 2025 15:55

feat(instrumentation-aws-sdk): add bedrock extension

6cb00fa

Use minimum bedrock that supports Converse

765c280

Cleanups

94d7fe4

Update deps

36ca8fd

Merge branch 'main' of https://github.com/open-telemetry/opentelemetr…

fc8fd69

…y-js-contrib into aws-bedrock-extension

git add

5bf7306

anuraaga requested a review from a team as a code owner February 6, 2025 04:10

github-actions bot assigned blumamir, jj22ee and trivikr Feb 6, 2025

github-actions bot requested a review from blumamir February 6, 2025 04:11

github-actions bot added the pkg:instrumentation-aws-sdk label Feb 6, 2025

github-actions bot requested review from jj22ee and trivikr February 6, 2025 04:11

anuraaga changed the title ~~Aws bedrock extension~~ feat(instrumentation-aws-sdk): add bedrock extension to apply gen ai conventions Feb 6, 2025

codefromthecrypt approved these changes Feb 6, 2025

jj22ee reviewed Feb 8, 2025

anuraaga added 4 commits February 10, 2025 10:08

Cleanup

e2366aa

Merge branch 'main' of https://github.com/open-telemetry/opentelemetr…

018cb5c

…y-js-contrib into aws-bedrock-extension

Extract converse handler

b8e3c16

Add pre-implementation test case of InvokeModel

c33aaaa

git add

f768a7d

trentm mentioned this pull request Feb 19, 2025

discussion: how to npm test with modern versions of instrumented packages? #2722

Closed

anuraaga mentioned this pull request Feb 20, 2025

Add instrumentation of AWS Bedrock to use gen_ai conventions open-telemetry/opentelemetry-java-instrumentation#13355

Merged

anuraaga added 4 commits March 19, 2025 10:26

Merge branch 'main' of https://github.com/open-telemetry/opentelemetr…

114211d

…y-js-contrib into aws-bedrock-extension

TAV

52900ed

Cleanup

ab76ddb

Setup fixed credentials for dryrun mode

ae73969

jj22ee approved these changes Mar 19, 2025

anuraaga and others added 6 commits March 19, 2025 12:00

Cleanup

fc9a99f

regenerate src/semconv.ts with .../scripts/gen-semconv-ts.js (sorted …

8aaf08c

…exports by name)

briefly mention the custom bedrock-runtime instr in the README

bde108c

drop the skip-test-if support added in open-telemetry#2723 because it…

fa12583

… is no longer necessary With JS SDK 2.0 the min supported Node.js is v18, which suffices for the latest instrumented deps currently being used.

let's *not* remove the skip-test-if.js script even though it isn't cu…

c1fc1f6

…rrently used I expect to use it in instr-fastify, and leaving it in here means a little less churn in the repo.

Merge branch 'main' into aws-bedrock-extension

c220322

trentm approved these changes Mar 19, 2025

trentm merged commit 2b7feac into open-telemetry:main Mar 19, 2025
23 checks passed

dyladan mentioned this pull request Mar 19, 2025

chore: release main #2760

Merged

trivikr mentioned this pull request Mar 20, 2025

test(aws-sdk): remove runs for Node.js <18.x #2763

Closed

deejay1 pushed a commit to deejay1/opentelemetry-js-contrib that referenced this pull request Apr 14, 2025

feat(instrumentation-aws-sdk): add bedrock-runtime extension to apply…

4ad61a6

… gen ai conventions (open-telemetry#2700) Co-authored-by: Trent Mick <[email protected]>
