Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(instrumentation-aws-sdk): add bedrock extension to apply gen ai conventions #2700

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Feb 6, 2025

Which problem is this PR solving?

  • AWS SDK spans for bedrock should be gen AI spans but are currently generic SDK spans

Short description of the changes

  • Add a service extension for bedrock, applying gen AI conventions
    • This initial PR is to get the general infrastructure and setup so applies minimally, only Converse with span attributes. Future PRs will add other bedrock APIs and other gen AI conventions such as events and metrics
  • Update AWS SDK deps to sync up with the now newer required SDK version

/cc @trentm @codefromthecrypt

@anuraaga anuraaga requested a review from a team as a code owner February 6, 2025 04:10
@github-actions github-actions bot requested a review from blumamir February 6, 2025 04:11
@github-actions github-actions bot requested review from jj22ee and trivikr February 6, 2025 04:11
@anuraaga anuraaga changed the title Aws bedrock extension feat(instrumentation-aws-sdk): add bedrock extension to apply gen ai conventions Feb 6, 2025
@anuraaga
Copy link
Contributor Author

anuraaga commented Feb 6, 2025

Hi @AmanAgarwal041 - just wanted to share this PR adding gen ai instrumentation of bedrock. Notably, for tests it takes a pattern of recording real LLM responses using nock back, which I see most gen ai instrumentation outside of opentelemetry taking due to the complexity of the models. It may be a good approach for #2402 (if you need help with that PR, let me know). Cheers.

Copy link

codecov bot commented Feb 6, 2025

Codecov Report

Attention: Patch coverage is 96.55172% with 2 lines in your changes missing coverage. Please review.

Project coverage is 91.05%. Comparing base (92106ff) to head (c33aaaa).

Files with missing lines Patch % Lines
...entelemetry-instrumentation-aws-sdk/src/aws-sdk.ts 0.00% 1 Missing ⚠️
...umentation-aws-sdk/src/services/bedrock-runtime.ts 97.67% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2700      +/-   ##
==========================================
+ Coverage   90.96%   91.05%   +0.09%     
==========================================
  Files         171      168       -3     
  Lines        8133     7880     -253     
  Branches     1649     1623      -26     
==========================================
- Hits         7398     7175     -223     
+ Misses        735      705      -30     
Files with missing lines Coverage Δ
...entelemetry-instrumentation-aws-sdk/src/semconv.ts 100.00% <100.00%> (ø)
...ntation-aws-sdk/src/services/ServicesExtensions.ts 96.55% <100.00%> (+0.25%) ⬆️
...entelemetry-instrumentation-aws-sdk/src/aws-sdk.ts 94.67% <0.00%> (-0.39%) ⬇️
...umentation-aws-sdk/src/services/bedrock-runtime.ts 97.67% <97.67%> (ø)

... and 8 files with indirect coverage changes

@anuraaga
Copy link
Contributor Author

anuraaga commented Feb 6, 2025

@trentm Oh I remembered now that the latest version of the AWS SDK doesn't support Node 14, which is causing the unit tests to fail after anuraaga#1 (comment). Let me know what's a good way forward, I think either dropping Node 14 support for this package (not sure it's allowed) or downgrading again.

Copy link

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprisingly concise change to add the toehold here. I know there's more to do, but good work!

@jj22ee
Copy link
Contributor

jj22ee commented Feb 7, 2025

Hi @anuraaga, thanks for this. Just to let you know, the ADOT team from AWS also had vested interest in Gen AI support in AWS SDK instrumentation, and planning to contribute in the future as well.

There are Bedrock Service Extension implementations in multiple languages in the ADOT's downstream of the auto-instrumentations (JS, Python), which are similar but might not be fully 1:1 with your changes. I was wondering if you could also consider ADOT's implementations in this PR? I'll also spend time to compare the 2 implementations for any subtle differences.

@codefromthecrypt
Copy link

@jj22ee drive-by comment, but one thing to give heads up about is that the toehold here is based on the latest genai semantic conventions, and I don't expect this to want to do too much in one PR. So, one thing to think about is what happens in this PR vs a follow-up to add more features or things that are defined beyond the otel specs. Doing it like that might result in the same or similar end, but faster vs trying to do too much in the first of many PRs. food for thought!

https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-spans.md

@jj22ee
Copy link
Contributor

jj22ee commented Feb 7, 2025

Thanks for the heads up @codefromthecrypt. I agree to start smaller. Just wanted to ensure there are no conflicts, but the latest Gen AI conventions will take priority, so I have no issue with that.

Comment on lines 100 to 106
const { inputTokens, outputTokens } = usage;
if (inputTokens !== undefined) {
span.setAttribute(ATTR_GEN_AI_USAGE_INPUT_TOKENS, inputTokens);
}
if (outputTokens !== undefined) {
span.setAttribute(ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, outputTokens);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case usage is undefined:

Suggested change
const { inputTokens, outputTokens } = usage;
if (inputTokens !== undefined) {
span.setAttribute(ATTR_GEN_AI_USAGE_INPUT_TOKENS, inputTokens);
}
if (outputTokens !== undefined) {
span.setAttribute(ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, outputTokens);
}
if (usage) {
const { inputTokens, outputTokens } = usage;
if (inputTokens !== undefined) {
span.setAttribute(ATTR_GEN_AI_USAGE_INPUT_TOKENS, inputTokens);
}
if (outputTokens !== undefined) {
span.setAttribute(ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, outputTokens);
}
}

* limitations under the License.
*/

// Gen AI conventions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also help copy over comments for the copied ATTR_* attribute keys from the semantic-conventions?

Example from a resource-detector package

@karthikscale3
Copy link

nice work! looks good for the most part.

@anuraaga
Copy link
Contributor Author

anuraaga commented Feb 10, 2025

Thanks @jj22ee - I think the implementation pattern itself is similar/same as almost all the work is done by the service extension anyways. Technically there isn't any overlap yet since AFAIU, ADOT currently only instruments InvokeModel while I started with Converse here, though when adding InvokeModel support here I think it will be very similar to ADOT. Initially, this wasn't obvious so I reworked the implementation to make it clearer for this PR it is Converse-only, while hopefully making the extension to InvokeModel obvious. Skimming through the attributes, I didn't see any conflicts in semconv either yet (it's still gen_ai.system, for now...)

The bigger difference is the test infrastructure using nock-back, which we also adopted in upstream python

https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/instrumentation/opentelemetry-instrumentation-botocore/tests/conftest.py#L71

so I think if we get this in for the initial infrastructure we'll be able to smoothly add more features over subsequent PRs by only focusing on business logic instead of build infra.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants