fix Update json_extract to Produce Canonicalized Output #24614

Open
wants to merge 1 commit into
base: master

duxiao1212
Contributor

Description

Canonicalize json_extract output: instead of copying the extracted JSON to the output, write it to the output stream with sorted keys.
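
To illustrate what "sorted keys" means for the output, here is a minimal sketch that uses only the Java standard library. It is not the code in this PR: the diff goes through a Jackson mapper (`SORTED_MAPPER`), whereas the class `CanonicalJsonSketch` and its `canonicalize` method are hypothetical names that operate on an already-parsed value (nested `Map`, `List`, `String`, number, boolean, or `null`).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; the PR itself relies on a Jackson mapper.
public class CanonicalJsonSketch
{
    // Serializes an already-parsed JSON value with object keys in sorted order.
    public static String canonicalize(Object value)
    {
        StringBuilder out = new StringBuilder();
        write(value, out);
        return out.toString();
    }

    private static void write(Object value, StringBuilder out)
    {
        if (value instanceof Map) {
            // TreeMap iterates its keys in natural (lexicographic) order.
            Map<String, Object> sorted = new TreeMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                sorted.put(String.valueOf(entry.getKey()), entry.getValue());
            }
            out.append('{');
            boolean first = true;
            for (Map.Entry<String, Object> entry : sorted.entrySet()) {
                if (!first) {
                    out.append(',');
                }
                first = false;
                writeString(entry.getKey(), out);
                out.append(':');
                write(entry.getValue(), out);
            }
            out.append('}');
        }
        else if (value instanceof List) {
            // Array order is significant in JSON, so it is preserved.
            out.append('[');
            boolean first = true;
            for (Object item : (List<?>) value) {
                if (!first) {
                    out.append(',');
                }
                first = false;
                write(item, out);
            }
            out.append(']');
        }
        else if (value instanceof String) {
            writeString((String) value, out);
        }
        else {
            // Numbers, booleans, and null print as their literal form.
            out.append(String.valueOf(value));
        }
    }

    private static void writeString(String s, StringBuilder out)
    {
        out.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            }
            else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            }
            else {
                out.append(c);
            }
        }
        out.append('"');
    }

    public static void main(String[] args)
    {
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put("y", null);
        inner.put("x", true);
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("b", "text");
        doc.put("a", Arrays.asList(1, inner));
        // prints {"a":[1,{"x":true,"y":null}],"b":"text"}
        System.out.println(canonicalize(doc));
    }
}
```

Two inputs that differ only in key order produce the same string under this scheme, which is what makes comparing the results meaningful.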

Motivation and Context

This relates to issue #24563.
Canonicalizing the JSON output makes JSON comparisons reliable and aligns the behavior with json_parse.

Impact

low impact

Test Plan

The unit tests are self-explanatory.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== NO RELEASE NOTE ==

@duxiao1212 duxiao1212 requested a review from a team as a code owner February 24, 2025 03:21
@prestodb-ci prestodb-ci added the from:Meta PR from Meta label Feb 24, 2025
@shangm2
Contributor

shangm2 commented Feb 24, 2025

LGTM. Maybe also add a verifier run?

@duxiao1212 duxiao1212 force-pushed the master branch 3 times, most recently from 29fb500 to d7fc7ce Compare February 24, 2025 15:04
}

@Test
public void testParseNullIfJsonInvalid()

Test for the expected exception here instead of checking for null. You can use assertInvalidFunction() once you change to throwing an invalid_function_argument error code.

SORTED_MAPPER.writeValue((OutputStream) dynamicSliceOutput, SORTED_MAPPER.readValue(jsonParser, Object.class));
// nextToken() returns null if the input is parsed correctly,
// but will throw an exception if there are trailing characters.
jsonParser.nextToken();

Catch the exception and throw a PrestoException with an invalid_function_argument error code. Include the failing jsonInput in the error message. Also, when there is no exception, check that the value returned by nextToken() is null (a non-null value with no exception means there is another valid JSON object after this one), and throw a PrestoException with invalid_function_argument for that case too.
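
The control flow requested in this comment could look roughly like the sketch below. To keep it self-contained, it uses hypothetical stand-ins: `TokenSource` plays the role of the JSON parser's `nextToken()`, and `InvalidFunctionArgumentException` plays the role of a PrestoException carrying the invalid_function_argument error code. The method name `requireEndOfInput` and the message wording are assumptions, not part of the PR.

```java
import java.io.IOException;

// Hypothetical illustration of the review suggestion; not the PR's code.
public class TrailingTokenCheckSketch
{
    // Stand-in for the parser: returns null once the input is exhausted,
    // and throws if the remaining characters are not valid JSON.
    public interface TokenSource
    {
        Object nextToken() throws IOException;
    }

    // Stand-in for PrestoException with an invalid_function_argument code.
    public static class InvalidFunctionArgumentException
            extends RuntimeException
    {
        public InvalidFunctionArgumentException(String message, Throwable cause)
        {
            super(message, cause);
        }
    }

    // Call after the first JSON value has been read from the parser.
    public static void requireEndOfInput(TokenSource parser, String jsonInput)
    {
        Object trailing;
        try {
            trailing = parser.nextToken();
        }
        catch (IOException e) {
            // Case 1: trailing characters that do not parse.
            throw new InvalidFunctionArgumentException("Invalid JSON, trailing characters: " + jsonInput, e);
        }
        if (trailing != null) {
            // Case 2: no exception, but another valid JSON value follows.
            throw new InvalidFunctionArgumentException("Invalid JSON, more than one value: " + jsonInput, null);
        }
    }
}
```

Both failure cases carry the offending input in the message, so a test can assert on the exception rather than on a null result.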

@rschlussel
Contributor

Also, let's gate this change behind a configuration/session property. You can look at usages of other properties in SqlFunctionProperties for examples.
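
As a rough picture of what such gating might look like, the sketch below switches between the existing copy behavior and the canonicalizing path based on a boolean flag. Everything here is hypothetical: `FunctionProperties` is a stand-in for the real properties object, and the flag name `canonicalizedJsonExtract` is an assumption rather than an existing Presto property.

```java
import java.util.function.UnaryOperator;

// Hypothetical illustration of gating a behavior change behind a property.
public class GatedJsonExtractSketch
{
    // Stand-in for the per-query function properties; the flag is assumed.
    public static final class FunctionProperties
    {
        private final boolean canonicalizedJsonExtract;

        public FunctionProperties(boolean canonicalizedJsonExtract)
        {
            this.canonicalizedJsonExtract = canonicalizedJsonExtract;
        }

        public boolean isCanonicalizedJsonExtract()
        {
            return canonicalizedJsonExtract;
        }
    }

    // Returns the extracted JSON as-is unless the flag enables canonicalization.
    public static String extract(
            FunctionProperties properties,
            String extractedJson,
            UnaryOperator<String> canonicalizer)
    {
        if (!properties.isCanonicalizedJsonExtract()) {
            // Existing behavior: pass the extracted JSON through unchanged.
            return extractedJson;
        }
        // New behavior: rewrite the value in canonical form.
        return canonicalizer.apply(extractedJson);
    }
}
```

Defaulting the flag to off would keep existing queries unchanged until the new output has been validated, for example with a verifier run.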


4 participants