fix: sanitize pydantic v2 input_value in validation errors #71

Merged: Evrard-Nil merged 3 commits into main from fix/sanitize-pydantic-v2-errors on Mar 26, 2026

@Evrard-Nil
Contributor

Summary

  • vLLM returns validation errors in pydantic v2 format (input_value={...}, input_type=dict), which bypassed the existing sanitizer because it only checked for the Python dict format ('input':, 'ctx':)
  • This could leak user message content in error responses when vLLM rejects malformed requests (e.g. unsupported content part types)
  • Adds detection and sanitization of input_value=/input_type= fields, stripping user data while preserving error description and type
  • Also strips pydantic v2 "For further information visit" URL lines
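
A minimal sketch of the approach described in the bullets above, not the code from this PR. The function names `sanitize_pydantic_v2` and `sanitize_line` are hypothetical, the sketch uses plain string search (the repository may do this differently), and the sample error text in the comments is illustrative only.

```rust
/// Hypothetical helper (not the PR's actual function): strips pydantic v2
/// `input_value=` / `input_type=` fields and documentation URL lines from an
/// upstream error message, keeping the description and the error type.
pub fn sanitize_pydantic_v2(message: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    for line in message.lines() {
        // Drop the "For further information visit <url>" line that
        // pydantic v2 appends to each error.
        if line.trim_start().starts_with("For further information visit") {
            continue;
        }
        if line.contains("input_value=") || line.contains("input_type=") {
            out.push(sanitize_line(line));
        } else {
            out.push(line.to_string());
        }
    }
    out.join("\n")
}

/// Hypothetical helper: rewrites a line such as
///   `  Input should be a valid string [type=string_type, input_value={...}, input_type=dict]`
/// into
///   `  Input should be a valid string [type=string_type]`
fn sanitize_line(line: &str) -> String {
    const TYPE_PREFIX: &str = "[type=";
    if let Some(start) = line.find(TYPE_PREFIX) {
        // Text before the bracket is the human-readable description.
        let desc = line[..start].trim_end();
        // The error type runs up to the next ',' or the closing ']'.
        let rest = &line[start + TYPE_PREFIX.len()..];
        let end = rest
            .find(|c: char| c == ',' || c == ']')
            .unwrap_or(rest.len());
        format!("{} [type={}]", desc, &rest[..end])
    } else {
        // No bracketed metadata: keep only the text before the first marker.
        let cut = ["input_value=", "input_type="]
            .iter()
            .filter_map(|m| line.find(*m))
            .min()
            .unwrap_or(line.len());
        line[..cut].trim_end().to_string()
    }
}
```

Passing a vLLM-style error through `sanitize_pydantic_v2` should leave the location path and the `[type=...]` tag in place while removing everything the user sent. Lines in the older Python dict format ('input':, 'ctx':) are left to the existing sanitizer in this sketch.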

Test plan

  • 5 new unit tests covering pydantic v2 format, nested dicts, mixed formats, and end-to-end parse_upstream_error
  • All 278 existing tests pass (184 unit + 94 integration)
  • Manually verify, by sending a request with a file content part through cloud-api → inference-proxy → vLLM, that input_value is stripped from the error response

🤖 Generated with Claude Code

vLLM returns validation errors in pydantic v2 format which uses
input_value={...} instead of 'input': ... — this bypassed the
existing sanitizer, potentially leaking user message content in
error responses.
@gemini-code-assist

Summary of Changes

This pull request addresses a potential data leakage vulnerability where vLLM's Pydantic v2 validation errors could expose sensitive user data, such as message content, in error responses. The changes extend the existing error sanitization mechanism to specifically detect and strip input_value and input_type fields from Pydantic v2 formatted errors, while preserving essential error descriptions. This enhancement significantly improves data privacy and security by ensuring that user-provided content is not inadvertently returned in error messages.

Highlights

  • Pydantic V2 Error Sanitization: Extended validation error sanitization to include Pydantic v2 formats, specifically input_value and input_type fields, to prevent data leakage.
  • Data Privacy Enhancement: Prevented the leakage of sensitive user message content in error responses from vLLM when malformed requests are rejected.
  • URL Stripping: Implemented stripping of Pydantic v2 "For further information visit" URL lines from error messages.
  • Comprehensive Testing: Added comprehensive unit tests for Pydantic v2 error sanitization, covering various scenarios including nested data and mixed error formats.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request enhances the sanitize_validation_errors function to correctly handle and sanitize Pydantic v2 error messages, preventing the leakage of sensitive user data from input_value and input_type fields, and also filters out Pydantic v2 documentation URLs. The review identified a potential data leak in the Pydantic v2 sanitization logic if the error message format is unexpected, and suggested an improvement to refactor repeated string literal checks for better maintainability.

- Fallback to "(validation error)" if desc contains input_value/input_type
  (handles edge case where sensitive data appears before brackets)
- Add test for bracket-less input_value format
- Run cargo fmt
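
The fallback in the first item of this commit message could look roughly like the sketch below. The name `safe_description` and the constant `FALLBACK` are hypothetical, and the actual implementation in the repository may be structured differently. Only the "(validation error)" placeholder text comes from the commit message.

```rust
/// Placeholder text taken from the commit message.
const FALLBACK: &str = "(validation error)";

/// Hypothetical guard: if the description that precedes the bracketed
/// metadata still mentions `input_value` or `input_type`, user data may
/// have appeared before the brackets, so the whole description is replaced.
pub fn safe_description(desc: &str) -> &str {
    let trimmed = desc.trim();
    if trimmed.contains("input_value") || trimmed.contains("input_type") {
        FALLBACK
    } else {
        trimmed
    }
}
```

Replacing the whole description is deliberately conservative: when the format is unexpected there is no reliable way to tell where user data starts, so nothing from the suspicious text is kept.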
@Evrard-Nil Evrard-Nil merged commit 598e00c into main Mar 26, 2026
5 checks passed