[Just for running test]Jessli/convert #43482

YoYoJa · 2025-10-17T20:55:58Z

Description

Please add an informative description that covers that changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

The pull request does not introduce [breaking changes]
CHANGELOG is updated for new features, bug fixes or other significant changes.
I have read the contribution guidelines.

General Guidelines and Best Practices

Title of the pull request is clear and informative.
There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

Pull request includes test coverage for the included changes.

* add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments

…into needuv/structured-results-otel-logging

Copilot

Pull Request Overview

This PR enhances the Azure AI Evaluation SDK with OpenTelemetry integration for logging evaluation results to Application Insights, and updates the prompty evaluation system to return structured data including token usage statistics.

Adds support for logging evaluation results to Application Insights using OpenTelemetry
Updates prompty evaluation to return dictionaries with token usage and metadata instead of raw responses
Includes comprehensive test coverage for the new evaluation result conversion functionality

Reviewed Changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
`setup.py`	Adds opentelemetry extra dependency for Application Insights integration
`_model_configurations.py`	Defines new TypedDict structures for evaluation results and App Insights configuration
`_legacy/prompty/_utils.py`	Updates format_llm_response to return structured dictionary with token usage
`_legacy/prompty/_prompty.py`	Changes AsyncPrompty return type to dict for structured response
`_evaluators/_common/_base_prompty_eval.py`	Updates evaluation logic to handle new structured prompty response format
`_evaluate/_evaluate.py`	Adds comprehensive evaluation result conversion and App Insights logging functionality
`_common/rai_service.py`	Enhances content harm response parsing with token usage statistics
`_aoai/*.py`	Updates AOAI grader classes with proper type attributes
`tests/unittests/test_evaluate.py`	Adds comprehensive unit tests for evaluation result conversion
`tests/e2etests/test_prompty_async.py`	Updates tests to handle new structured prompty response format
`tests/e2etests/test_mass_evaluate.py`	Adjusts test assertions for increased evaluation metrics

Copilot · 2025-10-17T20:57:06Z

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_legacy/prompty/_utils.py

+        input_str = f"{json.dumps(inputs)}" if inputs else ""
+        if inputs and len(inputs) > 0:
+            sample_input_json = []
+            msg = ChatCompletionUserMessageParam(
+                role="user",
+                content=input_str,
+            )
+            sample_input_json.append(msg)
+            sample_input = json.dumps(sample_input_json)


The variable sample_input is referenced on line 592 but is only assigned within the if block on lines 569-576. If inputs is empty or None, sample_input will be undefined when accessed, causing a NameError.

Copilot · 2025-10-17T20:57:06Z

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py

    elif client_type == "pf_client":
        batch_run_client = ProxyClient(user_agent=UserAgentSingleton().value)
-        # Ensure the absolute path is passed to pf.run, as relative path doesn't work with
+        # Ensure the absolute path is Re to pf.run, as relative path doesn't work with


Corrected incomplete comment from 'Re to pf.run' to 'passed to pf.run'.

Suggested change

# Ensure the absolute path is Re to pf.run, as relative path doesn't work with

# Ensure the absolute path is passed to pf.run, as relative path doesn't work with

Copilot · 2025-10-17T20:57:07Z

sdk/evaluation/azure-ai-evaluation/tests/e2etests/test_prompty_async.py

-        result = await prompty(firstName="Bob", question="What is the capital of France?")
-        assert isinstance(result, ChatCompletion)
-        response: str = result.choices[0].message.content or ""
+        result = await prompty(question="What is the capital of France?", firstName="Barbra", lastName="Streisand")


The function call uses firstName and lastName parameters but line 204 checks for 'Bob' in the response, which doesn't match the provided firstName 'Barbra'. This will cause the assertion to fail.

Copilot · 2025-10-17T20:57:07Z

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py

+    results: EvaluationResult,
+    logger: logging.Logger,
+    eval_id: Optional[str] = None,
+    eval_run_id: Optional[str] = None,
+    evaluators: Dict[str, Union[Callable, AzureOpenAIGrader]] = None,
+    eval_run_summary: Optional[Dict[str, Any]] = None,
+    eval_meta_data: Optional[Dict[str, Any]] = None,
+) -> None:


This function is extremely long (500+ lines) and handles multiple complex responsibilities including result parsing, data transformation, and summary calculation. Consider breaking it into smaller, focused functions for better maintainability and testability.

* add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * update * fix UT * fix tests * fix test

…sion/ from app insights config

* add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * Fix column mapping bug for AOAI evaluators with custom data mapping (#43429) * fix nesting bug for custom data mapping * address comments * remove extra code and fix test case * run formatter * use dumps * Modify logic for message body on Microsoft.ApplicationInsights.MessageData to include default message for messages with empty body and export logs (#43091) * Modify logic in PR (#43060) to include default message for messages with empty body and export logs * Update CHANGELOG * Update logic as per updated spec * Addressed comments * Set-VcpkgWriteModeCache -- add token timeout param for cmake generate's that exceed 1 hour (this can happen in C++ API View) (#43470) Co-authored-by: Daniel Jurek <[email protected]> * update * fix UT * fix tests * Added Tests and Samples for Paginated Queries (#43472) * added tests and samples for paginated queries * Apply suggestions from code review Co-authored-by: Copilot <[email protected]> * added single partition pagination sample --------- Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> * [Test Proxy] Support AARCH64 platform (#43428) * Delete doc/dev/how_to_request_a_feature_in_sdk.md (#43415) this doc is outdated * fix test * [AutoRelease] t2-iothub-2025-10-03-03336(can only be merged by SDK owner) (#43230) * code and test * update pyproject.toml --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> * [AutoRelease] t2-redisenterprise-2025-10-17-18412(can only be merged by SDK owner) (#43476) * code and test * update changelog * update changelog * Update CHANGELOG.md --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> * Extend basic test for "project_client.agents" to do more operations (#43516) * Sync eng/common directory with azure-sdk-tools for PR 12478 (#43457) * Updated validate pkg template to use packageInfo * Fixed typo * Fixed the right variable to use * output debug log * Fixed errors in expression evaluation * removed debug code * Fixed an issue in pipeline * Updated condition for variable setting step * Join paths of the script path * Use join-path * return from the function rather than exit --------- Co-authored-by: ray chen <[email protected]> * Reorder error and warning log line processing (#43456) Co-authored-by: Wes Haggard <[email protected]> * [App Configuration] - Release 1.7.2 (#43520) * release 1.7.2 * update change log * Modify CODEOWNERS for Azure SDK ownership changes (#43524) Updated CODEOWNERS to reflect new ownership for Azure SDK components. * Migrate Confidential Ledger library from swagger to typespec codegen (#42664) * regen * add default cert endpoint with tsp * remove refs to old namespace * update async operation patch * fix operations patch * fix header impl * more header fixes * revert receipt directory removal * cspell * regen certificates under correct namespace * regen ledger client * update namespace name * revert certificate change * update shared files after regen * updates * delete extra files * cspell * match return type to current behavior * cspell * mypy * pylint * update docs * regen * regen * fix patch * Revert "mypy" This reverts commit 6351ead. * add info in tsp_location.yaml * regen * update patch files * update patch files * fix patch * update patch files * regen * update tsp-location.yaml * generate certificate client * update patch files * fixes * regen clients * update pyproject.toml deps * update assets * regen * revert test change * nit * fix test input * regen with new model * update tests * update tests * apiview props * regen * update tests * update assets * apiview props * temp relative package updates * fix name * fix ledger ci (#43181) * remove swagger * remove extra configs * wip revert package dep temporarily * update readme * fix config files * Revert "wip revert package dep temporarily" This reverts commit db553c4. * move tests * add identity samples --------- Co-authored-by: catalinaperalta <[email protected]> * rm certificate files * update changelog * misc fixes * update shared reqs * test * pylint --------- Co-authored-by: catalinaperalta <[email protected]> * update scripts (#43527) Co-authored-by: helen229 <[email protected]> * [AutoPR azure-mgmt-mongocluster]-generated-from-SDK Generation - Python-5459673 (#43448) * Configurations: 'specification/mongocluster/resource-manager/Microsoft.DocumentDB/MongoCluster/tspconfig.yaml', API Version: 2025-09-01, SDK Release Type: stable, and CommitSHA: 'c5601446fc65494f18157aecbcc79cebcfbab1fb' in SpecRepo: 'https://github.com/Azure/azure-rest-api-specs' Pipeline run: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=5459673 Refer to https://eng.ms/docs/products/azure-developer-experience/develop/sdk-release/sdk-release-prerequisites to prepare for SDK release. * update changelog --------- Co-authored-by: ChenxiJiang333 <[email protected]> * App Configuration Provider - Key Vault Refresh (#41882) * Sync refresh changes * Key Vault Refresh * adding tests and fixing sync refresh * Updating Async * Fixed Async Tests * Updated tests and change log * Apply suggestions from code review Co-authored-by: Copilot <[email protected]> * Fixing merge issue * Updating comments * Updating secret refresh * Update _azureappconfigurationproviderasync.py * Fixing Optional Endpoint * fix mypy issue * fixing async test * mixing merge * fixing test after merge * Update testcase.py * Secret Provider Base * removing unused imports * updating exception * updating resolve key vault references * Review comments * fixing tests * tox updates * Updating Tests * Updating Async to be the same as sync * Fixing formatting * fixing tox and unneeded "" * fixing tox items * fix cspell + tests recording * Update test_async_secret_provider.py * Post Merge updates * Move cache to shared code * removed unneeded disabled * Update Secret Provider * Updating usage * Update assets.json * Updated to make secret refresh update dictionary * removing _secret_version_cache * Update assets.json * Update _secret_provider_base.py --------- Co-authored-by: Copilot <[email protected]> * Increment package version after release of azure-appconfiguration (#43531) * Patch `azure-template` back to `green` (#43533) * Update sdk/template/azure-template/pyproject.toml to use `repository` instead of `source` * added brackets for sql query keyword value (#43525) Co-authored-by: Andrew Mathew <[email protected]> * update changelog (#43532) Co-authored-by: catalinaperalta <[email protected]> * App Config Provider - Provider Refactor (#43196) * Code Cleanup * Move validation to shared file * Updating Header Check * Update test_azureappconfigurationproviderbase.py * moved async tests to aio folder * post merge updates --------- Co-authored-by: Ethan Winters <[email protected]> Co-authored-by: rads-1996 <[email protected]> Co-authored-by: Azure SDK Bot <[email protected]> Co-authored-by: Daniel Jurek <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: McCoy Patiño <[email protected]> Co-authored-by: Yuchao Yan <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Darren Cohen <[email protected]> Co-authored-by: ray chen <[email protected]> Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Zhiyuan Liang <[email protected]> Co-authored-by: Matthew Metcalf <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: helen229 <[email protected]> Co-authored-by: Scott Beddall <[email protected]>

Neehar Duvvuri and others added 30 commits September 30, 2025 12:36

skeleton code

fa2f8aa

add function that logs eval results to app insights

8dfde5b

format

771e19d

log red team data

7174bc4

add eval result converter

980e2fa

Add result converter

57c73b8

update converter params to optional

1730b17

add eval meta data

3bf93f7

merge converter change to include eval_meta_data

e47639d

fix type

5b198b4

remove useless file

5fbbabe

get eval meta data as input

6ca31a1

fix build errors

ea93d1a

remove useless import

e6a9caa

resolve comments

f24f0e0

update

0abddb0

get agent info from app insights config

f25760c

merge with main

08d3e36

update comments

518b4af

merge otel

74dfddc

merge result converter

15d881e

fix checker failure

5c44f70

rename function

46e0504

Merge branch 'main' of https://github.com/Azure/azure-sdk-for-python …

9cae8cf

…into needuv/structured-results-otel-logging

fix a thing

17cdd1e

add error msg and error code

2ce023e

merge main

654e28f

Surface evaluator error msg

32aad08

surface out error

d1449f5

Neehar Duvvuri and others added 2 commits October 17, 2025 16:43

fix unit tests

de8a43d

merge remote

fa3ae5c

YoYoJa requested a review from a team as a code owner October 17, 2025 20:55

Copilot AI review requested due to automatic review settings October 17, 2025 20:55

github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Oct 17, 2025

Copilot AI reviewed Oct 17, 2025

View reviewed changes

YoYoJa and others added 24 commits October 17, 2025 14:01

fix tests

e386fed

fix prompty tests

b82956d

fix similarity test

340b243

move groundedness to actual prompty impl

8b87275

chore: Update assets.json

ffbbacb

fix test

023bff7

merge remote

244b530

add extra attributes to app insights config, remove agent name/id/ver…

7abf6c3

…sion/ from app insights config

pin otel<1.39.0 since breaking change coming in that version

36e45f6

merge

201af13

merge main

ce3af31

implement scrubber for sensitive information

8388003

run black formatter

12cd578

fix spelling for evaluation sdk

c26a268

use non-deprecated path for emitting traces

37acd37

remove upper bound on otel sdk

691fc3a

shuffle imports

6ac7f0e

merge master and fix bug

ef92791

merge remote

c40feed

merge main

04912f3

merge remote

7450510

fix bug

fe2779d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Just for running test]Jessli/convert #43482

[Just for running test]Jessli/convert #43482

YoYoJa commented Oct 17, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Oct 17, 2025

Uh oh!

Copilot AI Oct 17, 2025

Uh oh!

Copilot AI Oct 17, 2025

Uh oh!

Copilot AI Oct 17, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	# Ensure the absolute path is Re to pf.run, as relative path doesn't work with
	# Ensure the absolute path is passed to pf.run, as relative path doesn't work with

[Just for running test]Jessli/convert #43482

Are you sure you want to change the base?

[Just for running test]Jessli/convert #43482

Conversation

YoYoJa commented Oct 17, 2025

Description

All SDK Contribution checklist:

General Guidelines and Best Practices

Testing Guidelines

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants