@mjbonifacio

tl;dr - I'm working with @its-hammer-time toward the following end goal:

Provide full, consistent information in SchemaValidationFailure when an OpenAPI Schema fails validation, whether the failure originates from the underlying jsonschema library or from libopenapi-validator. This will allow clients using libopenapi-validator to offer full visibility into what aspects of an input object failed validation to THEIR clients.

In order to do this, I propose some changes to struct fields on SchemaValidationFailure. Then I will make changes so that each existing validation is consistent with those definitions.

Once that's done, I'd like to merge this into a working branch as the changes will be breaking. Then, I'd like to implement the changes in error mapping/handling in another branch of my fork (and create another PR).

Full details below:

SchemaValidationFailure Struct Refactoring

tl;dr I wanted to get full clarity on how things will change before making all the changes to re-map fields, locations, values, etc.

This PR clarifies the purpose of each struct field on SchemaValidationFailure, so that we can get buy-in on those definitions before making a huge batch of changes.

Summary

Refactored SchemaValidationFailure struct to align with JSON Schema specification terminology, improve clarity, and eliminate ambiguity in field naming.

Changes Made

1. Renamed Fields for JSON Schema Alignment

DeepLocation → KeywordLocation

  • Before: DeepLocation string - Ambiguous name, unclear what "deep" referred to
  • After: KeywordLocation string - Matches JSON Schema spec terminology
  • Rationale:
    • KeywordLocation is the official JSON Schema term (keywordLocation in the specification's output format) and is the name used by the jsonschema/v6 library
    • Refers to the path in the schema to the violated keyword (e.g., /properties/email/pattern)
    • Makes it clear this is about the schema rule, not the data
    • Consistent with er.KeywordLocation from jsonschema.OutputUnit

AbsoluteLocation → AbsoluteKeywordLocation

  • Before: AbsoluteLocation string - Lost the "Keyword" context
  • After: AbsoluteKeywordLocation string - Full, unambiguous name
  • Rationale:
    • Matches JSON Schema spec terminology exactly
    • Pairs naturally with KeywordLocation (relative vs absolute)
    • Consistent with er.AbsoluteKeywordLocation from jsonschema.OutputUnit
    • Makes it clear this is also a schema location, just with absolute URI

2. Field Organization Improvements

Moved to Better Logical Grouping

// Data instance location (where in the data being validated):
InstancePath []string    // Raw path segments: ["user", "email"]
FieldName string         // Last segment: "email"  
FieldPath string         // JSONPath format: "$.user.email"

// Schema location (where in the schema that failed):
KeywordLocation string          // Relative: "/properties/email/pattern"
AbsoluteKeywordLocation string  // Absolute: "https://..."

This separation makes it crystal clear:

  • InstancePath, FieldPath, FieldName → Where in the DATA
  • KeywordLocation, AbsoluteKeywordLocation → Where in the SCHEMA
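
To illustrate how the three data-location fields relate to one another, here is a minimal sketch that derives FieldName and FieldPath from InstancePath, using the example values from the struct comments. The helpers fieldName and fieldPath are hypothetical names for illustration only and are not part of libopenapi-validator; the sketch also assumes plain object keys (no array indices or keys that need quoting).

```go
package main

import (
	"fmt"
	"strings"
)

// fieldName returns the last segment of an instance path, or "" for an empty path.
// Hypothetical helper, not part of libopenapi-validator.
func fieldName(instancePath []string) string {
	if len(instancePath) == 0 {
		return ""
	}
	return instancePath[len(instancePath)-1]
}

// fieldPath renders the segments as a simple dotted JSONPath rooted at "$".
// Assumption: segments are plain object keys; array indices and keys that
// need bracket quoting are not handled here.
func fieldPath(instancePath []string) string {
	if len(instancePath) == 0 {
		return "$"
	}
	return "$." + strings.Join(instancePath, ".")
}

func main() {
	path := []string{"user", "email"}
	fmt.Println(fieldName(path)) // email
	fmt.Println(fieldPath(path)) // $.user.email
}
```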

3. Removed Unused Field ReferenceExample

ReferenceExample - DELETED

  • Rationale:
    • Grep search found zero usages across entire codebase
    • Field was never populated or consumed
    • Removing reduces struct size and complexity

4. Improved Documentation

ReferenceSchema Comment

  • Before: "The schema that was referenced in the validation failure"
  • After: "The schema that was referenced in the validation failure"
  • Unchanged: the existing description was already clear and accurate

ReferenceObject Comment

  • Before: "The object that was referenced in the validation failure"
  • After: "The object that failed schema validation"
  • Rationale: More specific - this is the actual failed data, not just "referenced"

5. Prepared Location for deletion

Location - DEPRECATED

  • Status: Moved to bottom of struct, marked as deprecated
  • Deprecation Comment: // DEPRECATED in favor of explicit use of FieldPath & InstancePath
  • Rationale:
    • Was inconsistently set throughout codebase:
      • Sometimes er.InstanceLocation (data location)
      • Sometimes er.KeywordLocation (schema location)
      • Sometimes static strings like "unavailable", "schema compilation", "/required"
    • This ambiguity made it unreliable for consumers
    • Consumers now have explicit fields: use FieldPath/InstancePath for data, KeywordLocation for schema
  • This will be deleted in a follow-up PR once all usages have been replaced by the struct fields pertaining to the schema or keyword (whichever is appropriate).

6. Fixed Comment Typo

Line 99 in ValidationError struct

  • Before: "This is only populated whe the validation type is against a schema"
  • After: "This is only populated when the validation type is against a schema"

Verification

Compilation Check

  • ✅ All usages of DeepLocation → KeywordLocation updated
  • ✅ All usages of AbsoluteLocation → AbsoluteKeywordLocation updated
  • ✅ No remaining references to old field names
  • ✅ All tests pass

Updated Files

  1. errors/validation_error.go - Struct definitions
  2. schema_validation/validate_schema.go - Sets KeywordLocation and AbsoluteKeywordLocation
  3. schema_validation/validate_document.go - Sets KeywordLocation and AbsoluteKeywordLocation

Backward Compatibility

  • Location field still exists (deprecated but functional)
  • ✅ JSON/YAML tags unchanged for non-renamed fields
  • ✅ New field names have appropriate JSON tags (keywordLocation, absoluteKeywordLocation)

ValidationType/SubType will tell us what part of the HTTP request failed

ValidationError already provides clear HTTP component context via:

  • ValidationType: "parameter", "requestBody", "response"
  • ValidationSubType: "path", "query", "header", "cookie", "schema"

This means consumers can easily determine if an error is from:

  • Path parameters: ValidationType == "parameter" && ValidationSubType == "path"
  • Query parameters: ValidationType == "parameter" && ValidationSubType == "query"
  • Headers: ValidationType == "parameter" && ValidationSubType == "header"
  • Cookies: ValidationType == "parameter" && ValidationSubType == "cookie"
  • Request body: ValidationType == "requestBody"
  • Response body: ValidationType == "response"
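
The mapping above could be sketched as a small classifier. This is only an illustration: validationError is a local stand-in holding just the two fields discussed here, component is a hypothetical helper, and the string values are taken from the lists above rather than from the library's own constants.

```go
package main

import "fmt"

// validationError is a local stand-in with only the two fields discussed above.
// It is not the library's ValidationError type.
type validationError struct {
	ValidationType    string
	ValidationSubType string
}

// component reports which part of the HTTP exchange an error refers to,
// following the mapping listed in the description. Hypothetical helper.
func component(e validationError) string {
	switch e.ValidationType {
	case "parameter":
		switch e.ValidationSubType {
		case "path":
			return "path parameter"
		case "query":
			return "query parameter"
		case "header":
			return "header"
		case "cookie":
			return "cookie"
		}
		return "parameter"
	case "requestBody":
		return "request body"
	case "response":
		return "response body"
	}
	return "unknown"
}

func main() {
	fmt.Println(component(validationError{ValidationType: "parameter", ValidationSubType: "query"}))
	fmt.Println(component(validationError{ValidationType: "requestBody", ValidationSubType: "schema"}))
}
```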

The SchemaValidationFailure.Location field was never meant to indicate HTTP component - it was for within-field location.

Future Work (Separate Commit)

1. Add Tests for KeywordLocation Invariant

After the schema refactoring is complete, add tests to validate the following invariant:

  • When OriginalJsonSchemaError is nil, both KeywordLocation and AbsoluteKeywordLocation should be empty strings
  • When OriginalJsonSchemaError is not nil, both fields may be populated (when the error originates from JSON Schema validation)

Rationale: This documents the expected behavior that keyword location fields are only relevant when the validation failure originated from JSON Schema validation, not from other types of validation (e.g., parameter encoding errors, schema compilation errors).

Test scenarios:

func TestSchemaValidationFailure_KeywordLocations_WhenNotFromJsonSchema(t *testing.T) {
    // When OriginalJsonSchemaError is nil, KeywordLocation fields should be empty
}

func TestSchemaValidationFailure_KeywordLocations_WhenFromJsonSchema(t *testing.T) {
    // When OriginalJsonSchemaError is set, KeywordLocation fields may be populated
}

2. Remove Deprecated Location Field

Completely remove the deprecated Location field from SchemaValidationFailure in favor of its more specific counterparts:

  • Use FieldName for the specific field that failed
  • Use FieldPath for the JSONPath representation

Current state:

// DEPRECATED in favor of explicit use of FieldPath & InstancePath
Location string `json:"location,omitempty" yaml:"location,omitempty"`

Actions required:

  1. Remove the Location field from the struct
  2. Update all code that sets Location to use FieldPath or FieldName instead
  3. Update the Error() method to use non-deprecated fields

3. Update SchemaValidationFailure.Error() Method

The Error() method currently uses the deprecated Location field:

// Current (uses deprecated field):
func (s *SchemaValidationFailure) Error() string {
    return fmt.Sprintf("Reason: %s, Location: %s", s.Reason, s.Location)
}

Should be updated to use non-deprecated fields:

// Proposed:
func (s *SchemaValidationFailure) Error() string {
    if s.FieldPath != "" && s.KeywordLocation != "" {
        return fmt.Sprintf("Reason: %s, Field: %s, Keyword: %s", 
            s.Reason, s.FieldPath, s.KeywordLocation)
    }
    if s.FieldPath != "" {
        return fmt.Sprintf("Reason: %s, Field: %s", s.Reason, s.FieldPath)
    }
    if s.KeywordLocation != "" {
        return fmt.Sprintf("Reason: %s, Keyword: %s", s.Reason, s.KeywordLocation)
    }
    return fmt.Sprintf("Reason: %s", s.Reason)
}

Note: This change should be done in conjunction with removing the Location field to ensure all error messages remain informative.
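
To show what messages the proposed method would produce in each branch, here is a self-contained sketch. The type is a local stand-in carrying only Reason, FieldPath, and KeywordLocation, and the sample values are made up for illustration.

```go
package main

import "fmt"

// schemaValidationFailure is a local stand-in with only the fields Error() reads.
type schemaValidationFailure struct {
	Reason          string
	FieldPath       string
	KeywordLocation string
}

// Error mirrors the proposed implementation above.
func (s *schemaValidationFailure) Error() string {
	if s.FieldPath != "" && s.KeywordLocation != "" {
		return fmt.Sprintf("Reason: %s, Field: %s, Keyword: %s",
			s.Reason, s.FieldPath, s.KeywordLocation)
	}
	if s.FieldPath != "" {
		return fmt.Sprintf("Reason: %s, Field: %s", s.Reason, s.FieldPath)
	}
	if s.KeywordLocation != "" {
		return fmt.Sprintf("Reason: %s, Keyword: %s", s.Reason, s.KeywordLocation)
	}
	return fmt.Sprintf("Reason: %s", s.Reason)
}

func main() {
	// Both data and schema locations known.
	full := &schemaValidationFailure{
		Reason:          "does not match pattern",
		FieldPath:       "$.user.email",
		KeywordLocation: "/properties/email/pattern",
	}
	fmt.Println(full.Error())
	// Reason: does not match pattern, Field: $.user.email, Keyword: /properties/email/pattern

	// Neither location known: only the reason is reported.
	bare := &schemaValidationFailure{Reason: "schema compilation failed"}
	fmt.Println(bare.Error())
	// Reason: schema compilation failed
}
```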

Benefits

  1. Alignment with Standards: Now uses official JSON Schema specification terminology
  2. Clarity: Clear separation between data locations and schema locations
  3. Consistency: Field names match the source (jsonschema.OutputUnit)
  4. Reduced Ambiguity: Deprecated the confusing Location field
  5. Better Documentation: Improved comments make purpose clear
  6. Cleaner Code: Removed unused ReferenceExample field

Breaking Changes

⚠️ This is a breaking change for consumers who use:

  • DeepLocation field (now KeywordLocation)
  • AbsoluteLocation field (now AbsoluteKeywordLocation)

However, this is justified because:

  1. These names were misleading and caused confusion
  2. New names align with industry standard terminology
  3. The change makes the API more intuitive and self-documenting
  4. Consumers should be migrating away from Location anyway (now deprecated)

Related Context

Ongoing Work: Path Parameter Validation Errors

This refactoring is part of a larger effort to ensure all validation errors include comprehensive SchemaValidationErrors. We're systematically reviewing all error paths in ValidatePathParamsWithPathItem to populate schema validation details consistently.

@codecov

codecov bot commented Oct 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.09%. Comparing base (c440fec) to head (5e58242).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #188   +/-   ##
=======================================
  Coverage   97.08%   97.09%           
=======================================
  Files          40       41    +1     
  Lines        4397     4403    +6     
=======================================
+ Hits         4269     4275    +6     
  Misses        128      128           
Flag Coverage Δ
unittests 97.09% <100.00%> (+<0.01%) ⬆️

// ReferenceExample is an example object generated from the schema that was referenced in the validation failure.
ReferenceExample string `json:"referenceExample,omitempty" yaml:"referenceExample,omitempty"`
// The original jsonschema.ValidationError object, if the schema failure originated from the jsonschema library.
OriginalJsonSchemaError *jsonschema.ValidationError `json:"-" yaml:"-"`
Member

Why remove the ReferenceExample?

Member

Why change the name of OriginalError to OriginalJsonSchemaError? It's the same type; I think just updating the docs is sufficient here.

Author

Thanks for the feedback! Sure, let me explain.

The overall goal here is to clarify the responsibility of the SchemaValidationFailure type as much as possible. In this case, I believe that its responsibility will be broadened a bit.

  • Before: A thin wrapper around jsonschema.ValidationError, focused only on jsonschema issues.
  • After: A more comprehensive object that can report on any validation failure, whether it's a JSON Schema constraint or a different, OpenAPI-specific constraint.

  1. On renaming OriginalError to OriginalJsonSchemaError: I made this change to make the field's purpose more explicit. It clarifies that this field will only be populated when the failure is due to a JSON Schema rule. For other OpenAPI-specific validation failures, this field will be nil.

  2. On removing ReferenceExample: I didn't see it referenced anywhere else in the code, so I thought it should be removed for clarity. It was introduced here, but it wasn't clear to me whether the intended use was ever carried out. Would you mind explaining its intended use by library users? Happy to add it back if that makes sense.

Member

  1. OK. I can buy that.

  2. I need it.

https://github.com/pb33f/wiretap/blob/main/ui/src/components/violation/violation-details.ts#L165

I have not yet built what I need, but I need that property when I do.

Member

@daveshanley left a comment

I will need ReferenceExample to remain for downstream purposes; other than that, I like the changes.

Member

@daveshanley left a comment

LGTM
