Foundation Model Serve Health Check #4638

utkshukla · 2025-11-24T16:55:06Z

Foundation Model Serve Health Check

github-actions · 2025-11-24T16:55:46Z

Test Results for assets-test

0 tests 0 ✅ 0s ⏱️
0 suites 0 💤
0 files 0 ❌

Results for commit 1c6ce13.

♻️ This comment has been updated with latest results.

..._management/environments/foundation-model-serve/context/foundation/model/serve/api_server.py

+    except Exception as e:
+        logger.error(f"Downstream engine health check failed: {str(e)}")
+        return JSONResponse(
+            content={"status": "unhealthy", "downstream": "error", "error": str(e)},


To fix this problem, we should avoid sending the exception's string representation (str(e)) in responses returned to the user. Instead, log the exception on the server using the configured logger and return a generic error message to the client, e.g., “internal error.” Only minimal, non-sensitive information should be exposed in API responses. This change can be accomplished by editing the except Exception as e: block in the /health endpoint. The logger.error() call can still record the full details of the exception for server-side troubleshooting. No new methods or imports are needed, as appropriate logging is already in use.

Specifically:

Remove the "error": str(e) entry from the JSON response returned to the client on line 400.

Continue logging the error server-side as is.

Only assets/training/model_management/environments/foundation-model-serve/context/foundation/model/serve/api_server.py needs to be edited, within lines 397-401.
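The suggested change can be sketched in isolation. This is a minimal, framework-free illustration, not the code in api_server.py: the function name check_downstream_health, the probe callable, the "healthy"/"ok" success body, and the 200/503 status codes are assumptions made for the example. In the real endpoint the returned body and status code would be passed to JSONResponse.

```python
import logging

logger = logging.getLogger("api_server_sketch")  # hypothetical logger name


def check_downstream_health(probe):
    """Run probe() and return (body, status_code) without leaking exception text.

    `probe` is a hypothetical zero-argument callable that raises when the
    downstream engine cannot be reached.
    """
    try:
        probe()
        # Assumed success shape; the PR's actual success body is not shown.
        return {"status": "healthy", "downstream": "ok"}, 200
    except Exception:
        # Full details, including the stack trace, stay in the server log.
        logger.error("Downstream engine health check failed", exc_info=True)
        # The client sees only fixed, non-sensitive fields (no str(e)).
        return {"status": "unhealthy", "downstream": "error"}, 503
```

With this shape, an exception message such as a hostname or file path never reaches the response body, while the log entry still carries everything needed for troubleshooting.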

..._management/environments/foundation-model-serve/context/foundation/model/serve/api_server.py

github-actions · 2025-11-24T17:00:30Z

Test Results for training-model-mgmt-unittests

45 tests 45 ✅ 11s ⏱️
1 suites 0 💤
1 files 0 ❌

Results for commit 1c6ce13.

♻️ This comment has been updated with latest results.

rohitharkhani · 2025-11-25T03:55:10Z

..._management/environments/foundation-model-serve/context/foundation/model/serve/api_server.py

+                     503 Service Unavailable if downstream is not ready.
+    """
+    try:
+        ready_path = os.getenv(EnvironmentVariables.CONTAINER_READY_CHECK_PATH, "/ready")


It would be better to abstract this code into something reusable, because the readiness probe and the health check appear to be very similar.
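One way to read this suggestion is a single helper that both endpoints call with their own labels. The sketch below is only an illustration under stated assumptions: run_downstream_check, the probe callable, the "ok"/"unavailable" values, and the 200 success code are hypothetical, while the "unhealthy", "not_ready", "error", and "internal_error" strings come from the snippets in this PR.

```python
import logging
from typing import Callable, Dict, Tuple

logger = logging.getLogger("api_server_sketch")  # hypothetical logger name


def run_downstream_check(
    name: str,
    probe: Callable[[], bool],
    ok_status: str,
    fail_status: str,
) -> Tuple[Dict[str, str], int]:
    """Shared logic for the health and readiness checks (hypothetical helper).

    `probe` returns True when the downstream engine answers successfully and
    may raise on connection problems.
    """
    try:
        ok = probe()
    except Exception:
        # Log the full exception server-side; return only a generic marker.
        logger.error("Downstream engine %s check failed", name, exc_info=True)
        body = {"status": fail_status, "downstream": "error", "error": "internal_error"}
        return body, 503
    if ok:
        return {"status": ok_status, "downstream": "ok"}, 200
    return {"status": fail_status, "downstream": "unavailable"}, 503


def health(probe: Callable[[], bool]) -> Tuple[Dict[str, str], int]:
    return run_downstream_check("health", probe, "healthy", "unhealthy")


def ready(probe: Callable[[], bool]) -> Tuple[Dict[str, str], int]:
    return run_downstream_check("readiness", probe, "ready", "not_ready")
```

Keeping the exception handling in one place also means the fix for leaking str(e) only has to be made once rather than in each endpoint.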

..._management/environments/foundation-model-serve/context/foundation/model/serve/api_server.py

github-actions · 2025-11-25T04:03:14Z

Test Results for scripts-test

79 tests 79 ✅ 8m 48s ⏱️
1 suites 0 💤
1 files 0 ❌

Results for commit 1c6ce13.

♻️ This comment has been updated with latest results.

..._management/environments/foundation-model-serve/context/foundation/model/serve/api_server.py

+    except Exception as e:
+        logger.error(f"Downstream engine readiness check failed: {str(e)}")
+        return JSONResponse(
+            content={"status": "not_ready", "downstream": "error", "error": str(e)},


To resolve this issue, the exception handling in the ready endpoint should avoid returning the raw error message (str(e)) to the client. Instead, log the exception and return a generic error response. This ensures that sensitive internal details are kept only in server logs, not exposed via the API response. Specifically, for lines 449–451, change the code so that logger.error logs the full exception and stack trace for developer use, but the JSON returned to the client only includes a fixed, non-detailed "internal error" message. Required edits include:

Use logger.error with exc_info=True to capture stack trace.

Remove the "error": str(e) field from the response; replace it with a generic "error": "internal_error" or similar.

No additional imports or methods are needed beyond what is already present.

utkshukla added 19 commits May 29, 2025 12:31

4021541

15458f2

Merge branch 'main' of https://github.com/Azure/azureml-assets

f5cb68c

Merge branch 'main' of https://github.com/Azure/azureml-assets

126146e

Merge branch 'main' of https://github.com/Azure/azureml-assets

2c7905b

Merge branch 'main' of https://github.com/Azure/azureml-assets

b8f7234

Merge branch 'main' of https://github.com/Azure/azureml-assets

a369ded

Merge branch 'main' of https://github.com/Azure/azureml-assets

f407b98

Merge branch 'main' of https://github.com/Azure/azureml-assets

e48bb7c

Merge branch 'main' of https://github.com/Azure/azureml-assets

5a004fa

Merge branch 'main' of https://github.com/Azure/azureml-assets

7e7153e

Merge branch 'main' of https://github.com/Azure/azureml-assets

49ba82b

Merge branch 'main' of https://github.com/Azure/azureml-assets

6ff77a4

4736558

1806b16

Merge branch 'main' of https://github.com/Azure/azureml-assets

3c2327e

Merge branch 'main' of https://github.com/Azure/azureml-assets

493ddaa

Merge branch 'main' of https://github.com/Azure/azureml-assets

6f158b1

4736558

8094ae3

4736558

d610715

4736558

19f77e2

utkshukla requested a review from a team as a code owner November 24, 2025 16:55

utkshukla temporarily deployed to Testing November 24, 2025 16:55 — with GitHub Actions Inactive

github-advanced-security bot found potential problems Nov 24, 2025

4736558

7e5c12b

utkshukla temporarily deployed to Testing November 25, 2025 03:50 — with GitHub Actions Inactive

rohitharkhani reviewed Nov 25, 2025

..._management/environments/foundation-model-serve/context/foundation/model/serve/api_server.py Outdated

pabhatia-ms reviewed Nov 25, 2025

..._management/environments/foundation-model-serve/context/foundation/model/serve/api_server.py

4736558

1c6ce13

utkshukla temporarily deployed to Testing November 26, 2025 04:48 — with GitHub Actions Inactive

utkshukla temporarily deployed to Testing November 26, 2025 04:49 — with GitHub Actions Inactive

github-advanced-security bot found potential problems Nov 26, 2025

@@ -446,9 +446,9 @@
                         status_code=503
                     )
                 except Exception as e:
-                    logger.error(f"Downstream engine readiness check failed: {str(e)}")
+                    logger.error(f"Downstream engine readiness check failed: {str(e)}", exc_info=True)
                     return JSONResponse(
-                        content={"status": "not_ready", "downstream": "error", "error": str(e)},
+                        content={"status": "not_ready", "downstream": "error", "error": "internal_error"},
                         status_code=503
                     )

4 participants
