GSoC 2025: Comprehensive JSON Schema linting for encouraging best practices and catching anti-patterns early #856

Open
jviotti opened this issue Jan 15, 2025 · 17 comments
Labels
gsoc Google Summer of Code Project Idea

Comments

@jviotti
Member

jviotti commented Jan 15, 2025

Brief Description

Writing well-crafted schemas is extremely hard. Not only is JSON Schema a complex schema language in which it is easy to shoot yourself in the foot, but as an organisation, we have never properly encoded and shared what the best practices and anti-patterns are. As a consequence, our users (including users of API specifications like OpenAPI and AsyncAPI) don't know how to write great schemas, and don't even know the quality of the schemas they already have.

I kickstarted some of this work on my open-source tooling (https://github.com/sourcemeta/jsonschema/blob/main/docs/lint.markdown) and would like to take it to the next level.

Expected Outcomes

  • Reach agreement on what the anti-patterns and best practices are across dialects, and document them in the official JSON Schema organisation as a style guide / advice document on the JSON Schema website. There, we can publish each rule with a stable URL so that any linter can link to it when reporting failures, saving all of us the time of explaining the rules in each tool 😅
  • Extend popular open-source tooling made by TSC members and endorsed by the org (like the Sourcemeta JSON Schema CLI, but potentially others too) to encode the aforementioned anti-patterns and best practices in a runnable form
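
As a rough illustration of what "a rule with a stable URL in a runnable form" could look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the rule names, the `https://example.com/...` URLs, and the `Rule`, `walk` and `lint` helpers are hypothetical and are not the API of the Sourcemeta CLI or of any other tool. The walker also only descends into `properties` and `items`, which a real linter would need to extend per dialect.

```python
from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass(frozen=True)
class Rule:
    name: str                        # hypothetical rule identifier
    url: str                         # placeholder for a stable documentation URL
    message: str
    applies: Callable[[dict], bool]  # True when the subschema violates the rule


# Two illustrative rules; the names and URLs are made up for this sketch.
RULES = [
    Rule(
        name="single_value_enum",
        url="https://example.com/lint/single_value_enum",
        message="An `enum` with a single value can be written as `const`",
        applies=lambda s: isinstance(s.get("enum"), list) and len(s["enum"]) == 1,
    ),
    Rule(
        name="single_type_array",
        url="https://example.com/lint/single_type_array",
        message="A one-element `type` array can be written as a plain string",
        applies=lambda s: isinstance(s.get("type"), list) and len(s["type"]) == 1,
    ),
]


def escape(token: str) -> str:
    """Escape a property name for use in a JSON Pointer (RFC 6901)."""
    return token.replace("~", "~0").replace("/", "~1")


def walk(schema, pointer: str = "") -> Iterator[tuple[str, dict]]:
    """Yield (pointer, subschema) pairs, following only `properties` and `items`."""
    if not isinstance(schema, dict):
        return  # boolean schemas and malformed values are skipped
    yield pointer, schema
    properties = schema.get("properties")
    if isinstance(properties, dict):
        for key, child in properties.items():
            yield from walk(child, f"{pointer}/properties/{escape(key)}")
    if isinstance(schema.get("items"), dict):
        yield from walk(schema["items"], f"{pointer}/items")


def lint(schema) -> list[tuple[str, str, str]]:
    """Return (pointer, rule name, rule URL) for every violation found."""
    return [
        (pointer, rule.name, rule.url)
        for pointer, subschema in walk(schema)
        for rule in RULES
        if rule.applies(subschema)
    ]
```

Calling `lint({"type": ["string"], "properties": {"kind": {"enum": ["a"]}}})` reports one finding at the root and one at `/properties/kind`, each carrying the URL a tool could print next to its message.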

Skills Required

  • Strong communication skills, as the first point might involve talking to lots of people in the community and driving discussions to reach agreement.
  • A good understanding of JSON Schema, enough to appreciate and understand what the discovered anti-patterns and best practices would be. Of course, we will mentor a lot in this area
  • A good understanding of the programming language(s) used in the endorsed open-source linting tooling we will extend. The Sourcemeta JSON Schema CLI is written in C++ (it also has a library form that could then be used by other projects)

Mentors

Expected Difficulty

Medium.

Expected Time Commitment

350 hours. We expect most of the time to be spent driving discussions to reach agreement. Doing so often takes a lot of time!

@jviotti jviotti added the gsoc Google Summer of Code Project Idea label Jan 15, 2025
@Honyii
Contributor

Honyii commented Jan 15, 2025

Great idea Juan, thanks for your submission.

@benjagm
Collaborator

benjagm commented Jan 18, 2025

Hi Juan, have you seen the work done by @gregsdennis here:

https://github.com/json-schema-org/json-schema-linting

@jviotti
Member Author

jviotti commented Jan 20, 2025

Ah, very nice @benjagm. We should definitely take that as inspiration. Cool stuff. @gregsdennis Is it something you are actively working on?

@gregsdennis
Member

gregsdennis commented Jan 20, 2025

No. I started some Spectral stuff a year or so ago, but never got beyond a start. Feel free to overwrite what I've done.

@Relequestual
Member

Adding myself as a mentor as discussed with @jviotti =]

@cbum-dev

Hi @jviotti and @Relequestual
I’ve been learning JSON Schema from the official docs for the past few weeks and now have a good understanding of it. I’ve also explored various linting methods, including the Sourcemeta JSON Schema Linting Guide and Json Schema Linting, to understand existing approaches.
I’m interested in contributing to this issue for GSoC 2025 under your guidance, if this project gets selected.
To align with the project's needs, I'd love to get involved in community discussions on various linting methods and best practices. I'm also open to guidance on the most critical anti-patterns to focus on. Looking forward to your thoughts🙂.

@karenetheridge
Member

Please also see the prior work at

@saurabhraghuvanshii

Hi everyone, JSON Schema is now officially on the GSoC 2025 organisations list, and I want to work on this project with the respected mentors.
I have read the necessary documents, as well as the other reference documents and resources shared by the mentors. Looking forward to a response.

@GANESHSHARMA1

This comment has been minimized.

@jviotti
Member Author

jviotti commented Feb 28, 2025

Here is a proposal for the qualification task for this project (cc @Relequestual we can discuss it here in the open if you have better ideas!)

  • Spend some time of your choosing looking around the JSON Schema Slack, StackOverflow, and other sources where people are asking questions about JSON Schema
  • Based on that data, create a PDF report of the top 10 linting rules you would want to see us standardise and implement, including a brief explanation, an example, and why you picked it

The point here is not to have a comprehensive list, but show us that:

  • You have the ability to digest a large amount of information (Slack / StackOverflow)
  • You have the ability to make sense of diffuse, noisy information (e.g. from user conversations), and to understand what the users were trying to do and why the recommended solution helped (i.e. whether it reflects a best practice or an anti-pattern)
  • You have enough understanding of JSON Schema to judge whether something mentioned indeed looks like a best practice or not, and why.
  • You have enough understanding of JSON Schema to be able to prioritise which should be the top linting rules in your opinion

@jdesrosiers
Member

jdesrosiers commented Mar 3, 2025

(@GANESHSHARMA1, it looks like you posted in the wrong issue 😄. If you want to repost in the right place, I'd be happy to respond.)

@Relequestual
Member

Here is a proposal for the qualification task for this project (cc @Relequestual we can discuss it here in the open if you have better ideas!) - @jviotti

I think this, plus:

  • Determine whether, and how, 1-5 of these could be detected using JSON Schema
  • Propose an approach on how to autofix the same linting violations
  • Give each of the rules a level (error, warn, info), and justify your choice

Too much?

Also, do we want the rule definitions to be JSON based as opposed to code based, and then for there to be an engine to process the rules?
Using JSON Schema may not be faster than pure code, but it will likely still be faster using Blaze, and will allow the rulesets to be interoperable.
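
To make the "level" and "autofix" items in the list above more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the two rules, their names, and the levels assigned to them are hypothetical choices, and `autofix` is a made-up helper rather than part of any existing tool. The `enum`-to-`const` rewrite assumes a dialect that has `const` (Draft 6 or later), and the walker only follows `properties` and `items`.

```python
import copy


def fix_single_value_enum(schema: dict) -> bool:
    """Rewrite {"enum": [x]} as {"const": x}. Returns True if the schema changed."""
    enum = schema.get("enum")
    if isinstance(enum, list) and len(enum) == 1 and "const" not in schema:
        schema["const"] = enum[0]
        del schema["enum"]
        return True
    return False


def fix_single_type_array(schema: dict) -> bool:
    """Rewrite {"type": ["t"]} as {"type": "t"}. Returns True if the schema changed."""
    types = schema.get("type")
    if isinstance(types, list) and len(types) == 1:
        schema["type"] = types[0]
        return True
    return False


# (rule name, level, fixer): names and levels are illustrative assumptions.
FIXERS = [
    ("single_value_enum", "warn", fix_single_value_enum),
    ("single_type_array", "info", fix_single_type_array),
]


def _fix_in_place(schema, applied: list) -> None:
    if not isinstance(schema, dict):
        return  # boolean schemas have nothing to fix
    for name, level, fixer in FIXERS:
        if fixer(schema):
            applied.append((name, level))
    properties = schema.get("properties")
    if isinstance(properties, dict):
        for child in properties.values():
            _fix_in_place(child, applied)
    if isinstance(schema.get("items"), dict):
        _fix_in_place(schema["items"], applied)


def autofix(schema):
    """Return a fixed deep copy of the schema plus the (rule, level) pairs applied."""
    fixed = copy.deepcopy(schema)
    applied = []
    _fix_in_place(fixed, applied)
    return fixed, applied
```

Working on a copy keeps the original schema available for a before/after diff, which seems useful when asking a user to confirm an automatic fix.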

@Karan-Palan
Contributor

Hello @jviotti @Relequestual, I've been an active contributor in the JSON Schema organisation and have also contributed to Sourcemeta repositories such as core and jsonschema (the CLI wrapper on Blaze and core). I also created a VS Code extension using jsonschema-cli, based on this conversation.
I plan to work on my open PRs, add new features to the CLI, and update the extension with better linting as I finish adding --lint to the json command, after I'm done with my mid-sem exams. I have been learning JSON Schema from the tour and https://www.learnjsonschema.com/. I've started writing the doc with the 10 linting rules I think must be present. Could you please give an update on what the final qualification task is?

@jviotti
Member Author

jviotti commented Mar 3, 2025

@Relequestual

Also, do we want the rule definitions to be JSON based as opposed to code based, and then for there to be an engine to process the rules?
Using JSON Schema may not be faster than pure code, but it will likely still be faster using Blaze, and will allow the rulesets to be interoperable.

We can try to prototype this during GSoC as a bonus and see where we end up, in parallel to researching and collecting the actual set of linting rules?
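
For what such a prototype might look like, here is a small Python sketch of rules defined as JSON data, where each rule's `match` value is itself a JSON Schema describing the offending subschema. This is only a sketch under assumptions: the rule format (`name`, `level`, `match`) is invented for illustration, and the tiny `matches` function understands just the few keywords used here. It stands in for a full validator such as Blaze, whose API is deliberately not shown.

```python
import json

# Hypothetical, JSON-based rule definitions. `match` is a JSON Schema that
# validates successfully against subschemas that violate the rule.
RULES = json.loads("""
[
  {
    "name": "single_value_enum",
    "level": "warn",
    "match": {
      "required": ["enum"],
      "properties": {
        "enum": {"type": "array", "minItems": 1, "maxItems": 1}
      }
    }
  }
]
""")

_TYPES = {"array": list, "object": dict, "string": str}


def matches(pattern: dict, value) -> bool:
    """Evaluate a tiny subset of JSON Schema: type, required, properties, min/maxItems."""
    if "type" in pattern and not isinstance(value, _TYPES[pattern["type"]]):
        return False
    if isinstance(value, list):
        if len(value) < pattern.get("minItems", 0):
            return False
        if "maxItems" in pattern and len(value) > pattern["maxItems"]:
            return False
    if isinstance(value, dict):
        if any(key not in value for key in pattern.get("required", [])):
            return False
        for key, subpattern in pattern.get("properties", {}).items():
            if key in value and not matches(subpattern, value[key]):
                return False
    return True


def run_rules(rules, schema, pointer: str = ""):
    """Return (pointer, level, rule name) findings, descending into `properties` only."""
    findings = []
    if not isinstance(schema, dict):
        return findings
    for rule in rules:
        if matches(rule["match"], schema):
            findings.append((pointer, rule["level"], rule["name"]))
    properties = schema.get("properties")
    if isinstance(properties, dict):
        for key, child in properties.items():
            # Keys are not JSON Pointer escaped in this sketch.
            findings.extend(run_rules(rules, child, f"{pointer}/properties/{key}"))
    return findings
```

Because the rules are plain JSON, the same file could in principle be consumed by engines written in different languages, which is the interoperability argument made above.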

@Hello-Ship-Code

Really excited to contribute to this! Can't wait to get started! 🚀

@ShivamBisen

@jviotti @Relequestual I am looking forward to working on this project, though I am relatively new to JSON Schema. Could you provide some sources for learning best practices for JSON Schema? Is there any existing linting doc available?

@Karan-Palan
Contributor

Hello @ShivamBisen, you can learn JSON Schema from https://tour.json-schema.org/ and https://www.learnjsonschema.com/2020-12/. After that, write some schemas yourself.
