Conversation

@jpbrodrick89 jpbrodrick89 commented Dec 4, 2025

Relevant issue or PR

Description of changes

This PR just migrates the endpoint inputs used in test_examples.py and otherwise leaves the tests untouched. The inputs are all stored in a test_cases folder inside each tesseract_api folder. The naming convention is {descriptor:camelCase}_{endpoint}_input.json, except for the apply endpoint, where the convention is just {descriptor:camelCase}_inputs.json.
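For illustration, a test_cases folder might then contain files like these (the descriptors and endpoint choices here are hypothetical):

test_cases/
  basic_inputs.json                 # apply
  badShape_inputs.json              # apply, expected to fail
  basic_jacobian_input.json         # jacobian
  basic_abstract_eval_input.json    # abstract_eval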

Follow-up work will come in three stages (@dionhaefner lmk if you'd prefer this to be one PR or a sequence of PRs):

  1. Feed each test its respective inputs, storing outputs in either a specific format (i.e. json, json+base64, or json+binref) or in all formats, and then test against them. The tests then effectively become regression tests (we should consider whether that is indeed what we want, and also whether pytest-regressions already possesses the functionality we require).
  2. Refactor so that we don't need to manually associate endpoints with input files/Tesseracts. Careful thought is required on how to handle status codes: either the status code is inferred directly from the descriptor (i.e. bad... -> 422, my preference; see the sketch after this list), or we also keep files containing the expected status code (which seems a crazy idea).
  3. Introduce a public tesseract test/regress/test-regress CLI command to allow users to leverage this functionality.
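A minimal sketch of the descriptor-based status inference from item 2, assuming failing cases are marked with a "bad" prefix (the prefix convention is an assumption, purely illustrative):

def expected_status(descriptor: str) -> int:
    # Hypothetical: infer the expected status code from the descriptor,
    # e.g. badShape -> 422; anything else is expected to succeed.
    return 422 if descriptor.startswith("bad") else 200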

Optional (could come at any stage): introduce a tesseract testgen command to allow users to automate step 2 above.

Critical concern

The default base64 array encoding is not human-readable and may make tests hard to reason about without additional diagnostic effort. Potential solutions: always use the list format in CI tests (less exhaustive than currently); keep three json files, one for each encoding (but then we can't guarantee they stay in sync); or use the list format but generate the other json+... encodings on the fly (probably the best option if we think human readability is crucial).
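A minimal sketch of the on-the-fly conversion, assuming a simplified array schema with "shape"/"dtype" fields and a "data" object carrying "encoding" and "buffer" (these field and encoding names are assumptions, not the exact Tesseract schema):

import base64

import numpy as np

def to_base64_encoding(obj):
    # Recursively walk a decoded JSON payload and re-encode any
    # list-encoded ("json") arrays as base64.
    if isinstance(obj, dict):
        data = obj.get("data")
        if isinstance(data, dict) and data.get("encoding") == "json":
            arr = np.asarray(data["buffer"], dtype=obj["dtype"])
            return {
                **obj,
                "data": {
                    "encoding": "base64",
                    "buffer": base64.b64encode(arr.tobytes()).decode("ascii"),
                },
            }
        return {key: to_base64_encoding(val) for key, val in obj.items()}
    if isinstance(obj, list):
        return [to_base64_encoding(val) for val in obj]
    return obj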

Testing done

  • CI passes without modifying expected results.

codecov bot commented Dec 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.37%. Comparing base (12197b3) to head (51665a2).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #411      +/-   ##
==========================================
- Coverage   76.62%   76.37%   -0.26%     
==========================================
  Files          29       29              
  Lines        3359     3420      +61     
  Branches      525      533       +8     
==========================================
+ Hits         2574     2612      +38     
- Misses        553      576      +23     
  Partials      232      232              

@jpbrodrick89 jpbrodrick89 changed the title from "refactor[test]: move example test inputs to json files" to "test: move example test inputs to json files" Dec 4, 2025
@pasteurlabs pasteurlabs locked and limited conversation to collaborators Dec 4, 2025
@jpbrodrick89 jpbrodrick89 reopened this Dec 5, 2025
@jpbrodrick89
Contributor Author


should we call these empty_input.json or handle them in a different way? (Probably partially defeats the point of the refactor, but it would be more explicit.)

@dionhaefner
Contributor

Thanks @jpbrodrick89. This looks like a sensible first step.

Wrt the questions you raise, I wonder if a "test case" structure like this would make sense:

{
  "inputs": { },
  "expected_result": {
     "status_code": 400,
     "outputs": { }
  },
  "endpoint": "apply"
}

Unfortunately you can't use file aliasing with this, but almost (untested):

$ tesseract run mytess apply "$(jq '.inputs' testcase.json)"

... and perhaps to be augmented with a dedicated tesseract test command later ...
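A minimal sketch of what such a command could do under the hood, assuming the test case structure above and that tesseract run accepts a JSON payload with an "inputs" key (the exit-code handling and payload shape are assumptions):

import json
import subprocess
import sys

def run_test_case(image: str, path: str) -> bool:
    # Load a {"inputs", "expected_result", "endpoint"} test case and
    # run it against a built Tesseract via the CLI.
    with open(path) as f:
        case = json.load(f)

    proc = subprocess.run(
        ["tesseract", "run", image, case["endpoint"],
         json.dumps({"inputs": case["inputs"]})],
        capture_output=True,
        text=True,
    )
    expected = case["expected_result"]
    # How HTTP-style status codes map onto CLI exit codes is an open
    # question; here any non-200 expectation just means "should fail".
    if expected["status_code"] != 200:
        return proc.returncode != 0
    return proc.returncode == 0 and json.loads(proc.stdout) == expected["outputs"]

if __name__ == "__main__":
    sys.exit(0 if run_test_case(sys.argv[1], sys.argv[2]) else 1)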

What do people think?

@jpbrodrick89
Contributor Author

Good idea, @dionhaefner. I'm in two minds, but I think this is probably the right way to go. There are challenges we need to think carefully about, but these probably apply regardless of the json structure.

Advantages of single json

  • Better organised (everything in one file)
  • More maintainable
  • Extensible (we could add additional fields such as assertions, or reference a mixture of output formats).

Disadvantages of single json

  • Harder/less intuitive to generate regression tests: e.g. tesseract test apply -o example_outputs.json @example_inputs.json needs to be followed up by

    jq -n --slurpfile in example_inputs.json --slurpfile out example_outputs.json \
      '{inputs: $in[0], expected_result: {status_code: 200, outputs: $out[0]}, endpoint: "apply"}' \
      > test_case.json

    which means introducing tesseract gentest essentially becomes mandatory.

  • Not consistent with the approach in pytest-regressions, where all output files are kept separate from their inputs.
  • How do we treat check-gradients? Just have a status code and leave the outputs field blank?

General challenges of both approaches

What exactly are we trying to achieve here? Just regression testing, or a framework for more general testing of tesseract run behaviour?

Do we want to support "fuzzy tests", e.g. allowing for slight floating point differences? pytest-regressions mainly handles this by rounding for data_regression, with more customisable atol/rtol for pure numeric outputs (e.g. ndarray_regression, which would likely not work for us). We could arguably use data_regression straight out of the box to some degree, if this suits our purposes.
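If we roll our own, a minimal sketch of a tolerance-aware comparison over nested JSON (purely illustrative, not tied to any existing API):

import math

def approx_equal(a, b, rtol=1e-6, atol=0.0):
    # Recursively compare nested JSON-like structures, allowing small
    # floating point differences between numeric leaves.
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            approx_equal(a[k], b[k], rtol, atol) for k in a
        )
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(
            approx_equal(x, y, rtol, atol) for x, y in zip(a, b)
        )
    if isinstance(a, float) or isinstance(b, float):
        return math.isclose(a, b, rel_tol=rtol, abs_tol=atol)
    return a == b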

Do we want to check against all output formats, always check just one, or make this customisable?

Will we deprecate our "contains" tests in favour of the new full regression tests?

Do we want to reuse the apply inputs automatically in the gradient endpoints, extending them with AD fields?
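For that last point, a hypothetical helper sketch (the jac_inputs/jac_outputs field names mirror the jacobian endpoint signature, but the exact payload schema is an assumption):

import json

def extend_for_jacobian(apply_input_path, jac_inputs, jac_outputs):
    # Reuse a stored apply input file as the basis of a jacobian
    # payload by adding the differentiable input/output paths.
    with open(apply_input_path) as f:
        payload = json.load(f)
    payload["jac_inputs"] = list(jac_inputs)
    payload["jac_outputs"] = list(jac_outputs)
    return payload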

@dionhaefner
Contributor

I care less about the Tesseract Core test suite than about providing a general way to test user-built Tesseracts. I'd expect those to live in repos with Makefiles and build scripts, not necessarily full-fledged Python packages covered by pytest. So a simple command line tool like tesseract test mytestcase.json seems more useful than adhering to a framework like pytest-regressions. Does that make sense?
