Future of schema generators #101
From @HoldYourWaffle
I have created issues for most of the problems I encountered in YousefED/typescript-json-schema. I had to switch to that module because conditional types (particularly Omit) are not supported yet. But there's something else I want to discuss. There are currently 3 modules that accomplish basically the same goal (at least that I know of), and they all have a lot of issues. I see 3 possible solutions for this forkmania (though there could be more):
1. Fix YousefED/typescript-json-schema. This can sort of be seen as the 'default' option, as it would leave the current situation mostly untouched. This isn't my preferred solution because the architecture of vega/ts-json-schema-generator looks a lot better and it doesn't really solve the forkmania.
2. Write a new generator from scratch, using the knowledge and experience from the other three. If we were to do this we'd of course have the amazing power of hindsight and shared knowledge, which would allow us to write a clean, well-designed and future-proof generator. However, this would be very time-consuming and since vega/ts-json-schema-generator is already very well designed I don't see much reason to do this.
3. Merge all the good parts into vega/ts-json-schema-generator and 'open it up' for general usage. I think this would be the best option because it will actually fix the fork issue without consuming a lot of time and effort. I think this module currently has the best/cleanest code and I don't see a reason for using one of the others if we increased flexibility and supported more use cases than what vega is doing.
If you're interested I can write up a more detailed overview in a couple of hours. No matter what option you/we choose I'd love to help/fix/develop/maintain. |
Good to see there's interest in my proposal! Perhaps it would be good to pin this issue so more people will see this and hopefully voice their opinion on the matter? |
Thank you for the comments. I will use this issue to explain some of the different philosophies behind the different libraries. The goal of all of these libraries is to convert Typescript to JSON schema. There are different approaches to achieving this and also different interpretations that you can follow. One fundamental issue is that JSON schema and Types are not equivalent. Some things are not expressible in the other.
YousefED/typescript-json-schema
This was the original schema generator that I picked for my main use case, which is Vega-Lite. The philosophy of this library was to be flexible and configurable for different use cases. This means that some configurations work better than others because optimizing all of them is complex. It worked fairly well once I extended it a bit to deal with some of the more complicated types we use. However, I constantly ran into issues with getting meaningful properties that correspond to aliased types. The fundamental problem is that this library uses the type hierarchy, and not the AST to create the JSON schema. Therefore I went on to look for a different library, which was
I stopped active development but occasionally review PRs and make releases. I consider this library to be in maintenance mode.
xiag-ag/typescript-to-json-schema
This library was mostly written by @mrix and uses the AST to create the schema. It is also much more modular. Rather than having one big file, there are separate modules for parsing and code generation for all different types of AST nodes and types. I had to extend it significantly to make it work for Vega-Lite. That's what
vega/ts-json-schema-generator
This is the extended version of
Overall, this library is much more robust than
Even though this library still has some missing functionalities, it works in production for Vega-Lite, which is a complex piece of TypeScript. I keep fixing this library if it is necessary for Vega-Lite but otherwise don't have the cycles to do anything else. 
I am happy to review PRs if they have sufficient tests, don't introduce options, and provide useful additions (I will reject support for functions because there is no clean mapping to JSON schemas). I consider this library to be in active development and would be more than thrilled to have people help with it.
Conclusion
I have considered writing a new generator based on the things I have learned. Ideally, it would be a bit leaner than vega/ts-json-schema-generator and not have as many files. However, I don't have the cycles to do it and I'm not convinced that it will be much cleaner. Overall, I think investing resources into
Let me know what you think. |
Since your comment is pretty big I'm just going to respond per section.
This is of course true. I think the best way to handle this would be to have a section in the README that clearly lists all constructs that are not or only partially supported. It's annoying to discover something you need is unsupported, but discovering it after you've already started using an automated generator setup is way worse (I can unfortunately speak from experience). Maybe we should also look into a way to manually override the generated schema in a fluent way. This way any shortcomings of the library can be manually filled in or corrected. In my own projects I've been using a script that manually changes the generated JSON object but this is very inflexible and error-prone. I haven't really thought about how to implement such a mechanism, perhaps a JSDoc annotation with a JSON pointer could be something? I'll think about it.
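To make the override idea above concrete, here is a minimal sketch of applying a single override addressed by a JSON Pointer (RFC 6901) to an already generated schema. The helper names `applyOverride`, `child` and `decodeToken` are hypothetical and not part of any of the libraries discussed here; how the pointer and value would be read from a JSDoc annotation is left open.

```typescript
// Hypothetical helpers, for illustration only; none of the generators ship these.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Decode one JSON Pointer reference token (RFC 6901): "~1" -> "/", then "~0" -> "~".
function decodeToken(token: string): string {
  return token.replace(/~1/g, "/").replace(/~0/g, "~");
}

// Step from a container node to the child named by `token`.
function child(node: Json, token: string): Json {
  if (node === null || typeof node !== "object") {
    throw new Error(`Cannot descend into "${token}": parent is not a container`);
  }
  const next = Array.isArray(node) ? node[Number(token)] : node[token];
  if (next === undefined) throw new Error(`Nothing found at "${token}"`);
  return next;
}

// Return a deep copy of `schema` in which the location `pointer` holds `value`.
function applyOverride(schema: Json, pointer: string, value: Json): Json {
  if (pointer === "") return value; // the empty pointer addresses the whole document
  if (pointer.charAt(0) !== "/") throw new Error(`Invalid JSON Pointer: ${pointer}`);
  const root: Json = JSON.parse(JSON.stringify(schema));
  const tokens = pointer.split("/").slice(1).map(decodeToken);
  const last = tokens.pop() as string; // non-empty because the pointer starts with "/"
  let node: Json = root;
  for (const token of tokens) {
    node = child(node, token);
  }
  if (node === null || typeof node !== "object") {
    throw new Error(`Cannot set ${pointer}: parent is not a container`);
  }
  if (Array.isArray(node)) {
    node[Number(last)] = value;
  } else {
    node[last] = value;
  }
  return root;
}

// Example: add a keyword the generator could not infer.
const generated: Json = {
  type: "object",
  properties: { email: { type: "string" } },
};
const patched = applyOverride(generated, "/properties/email/format", "email");
```

Because the helper copies the schema before patching, the generated output stays untouched and overrides can be applied in any order.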
I'm not sure I understand what you mean by this. Are you trying to say that more options → more complexity → hard to get working correctly? I agree that having more options might increase complexity, but there should always be a sensible default behavior. Having more options to override common sense because its assumptions don't match your use case is a good thing in my opinion.
I'm not sure I agree with you on this. Having sensible defaults is always a good thing, and making some assumptions when designing something like this is necessary, but I don't see how this would hinder adding options. Could you give an example of an option you've rejected/don't want to add? I'm probably just misunderstanding what you're saying.
I'd love to help you maintain this project if you don't have the time for it! Again I'm not sure why you'd want to reject new options, could you clarify what you mean by this?
This is the main reason why I think a new generator isn't the best option. Is there a reason why we can't make vega/ts-json-schema-generator leaner without rewriting the whole thing? Also, what's wrong with having more files? It makes it a lot easier to find what you're looking for, as well as leaving less room for weird global state anti-patterns.
I also think this would be the best way forward. I don't see a reason why there would be conflicts with Vega, since the goal of a general purpose library is to support most (if not all) use cases, Vega included of course. If we decide to adopt this strategy there's one more issue remaining: uniting the modules/forks to fix the current forkmania. I think YousefED/typescript-json-schema could just be deprecated with a nice forward to this module (as soon as it's ready of course, mainly looking at conditional types). xiag-ag/typescript-to-json-schema is a different story though. I found this PR by you that aims to merge the 2 repositories together, but as you already know there hasn't been any response in 2 years. It seems like @mrix has disappeared from the community, which of course doesn't help our case. The main reason why I want the forks to be united is that the current situation is really confusing. Last week I basically went like this:
Removing YousefED/typescript-json-schema from the equation would help a lot. We may never get a response from @mrix, but we can remove this repository's forked status or add a clear explanation in the README on what the differences are and why you should probably use this module instead of the upstream one. On a completely different note, maybe it's a good idea to create an issue in YousefED/typescript-json-schema referencing this issue and pin it there too. Since that module has more users we're probably going to get more responses then. |
The issue here really is JSON schema and not Typescript. I have found good ways around missing things in typescript such as
More options increase the number of paths through the code and make it harder to get right. I am definitely against adding more code paths just to support another use case. Instead, the defaults should be good. As an example, including or not including aliases or not using a top-level reference have implications on many parts of the code. I can speak from experience that not having the config options avoided a bunch of headaches. I am not against the ability to configure things that have only implications on localized pieces of the code.
I agree, and still we got PRs and issues for it there: https://github.com/YousefED/typescript-json-schema/issues?utf8=%E2%9C%93&q=functions+
No. I think the current design is good enough and I don't see any reason to rewrite.
It's already in maintenance mode and I think that's what it should be. A forward link sounds good to me but then I want support with issues that people report ;-) xiag-ag/typescript-to-json-schema has a few features that we should get working in this fork. See #63
Go ahead. I will pin it. I already added a note to https://github.com/YousefED/typescript-json-schema#background. Before we can deprecate the other library, we need support for conditionals here. |
You really did a good job expressing stuff like It should also be noted that without
That makes a lot of sense. So something like
That makes sense, but is there really a reason why someone would want to use the other module once we include all missing features here? It's always good to keep providing support for something, but is it really worth the effort if this were to be a (practically) drop-in replacement?
I'd love to help, but I can't figure out what new features we're missing since there are so many changes in the vega version. If you can give me some kind of list I'd be more than happy to take a look.
Done. I also created an issue in @XriM's repository in case there are more lost souls like I was. |
I meant intersection types.
Yep. I think my request to keep the number of code paths low is reasonable and I think you agree.
In the future, yes.
|
Of course! I was just wondering where your "line" was on what's too complicated, glad to hear it's in a very reasonable place.
I tried to look through it, but I fear I'm just not well versed enough in the codebase to know what changed, what hasn't been done here already and what is even applicable to our version. Maybe we could copy over the tests that were added, see which ones fail and go from there? I'd love to try it but I'll have to figure out the test infrastructure first, which is of course going to take some time.
The more I use JSON schema the more I think "How have they not solved this yet?". I think from a spec perspective there are 2 logical solutions:
These "ideas" aren't very useful from a generator perspective of course since no validator supports them. The only solution I see is to duplicate & merge the inherited and inheriting schemas but this would probably get really messy really quickly (both in the code and in the output). Maybe adding a |
I merge the objects in the allOf into one big object. |
Are there any issues with this approach apart from messy output? |
If you want to generate code from the schema again, the information about intersections and inheritance is lost. |
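The merge described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the generator's actual code: every `allOf` member is assumed to be a plain object schema, and `ObjectSchema` and `mergeAllOf` are hypothetical names.

```typescript
// Simplified shape of an object schema; real schemas carry many more keywords.
interface ObjectSchema {
  type: "object";
  properties: { [name: string]: unknown };
  required?: string[];
}

// Flatten the members of an `allOf` into one object schema.
// The result no longer records which member a property came from.
function mergeAllOf(members: ObjectSchema[]): ObjectSchema {
  const properties: { [name: string]: unknown } = {};
  const required: string[] = [];
  for (const member of members) {
    for (const name of Object.keys(member.properties)) {
      const incoming = member.properties[name];
      const existing = properties[name];
      // Refuse to guess when two members disagree about the same property.
      if (existing !== undefined && JSON.stringify(existing) !== JSON.stringify(incoming)) {
        throw new Error(`Conflicting definitions for property "${name}"`);
      }
      properties[name] = incoming;
    }
    for (const name of member.required || []) {
      if (required.indexOf(name) === -1) required.push(name);
    }
  }
  const merged: ObjectSchema = { type: "object", properties };
  if (required.length > 0) merged.required = required;
  return merged;
}

// Example: the two halves of an intersection type such as `Identified & Named`.
const mergedExample = mergeAllOf([
  { type: "object", properties: { id: { type: "number" } }, required: ["id"] },
  { type: "object", properties: { name: { type: "string" } }, required: ["name"] },
]);
```

The output is a single flat object, which is exactly why the intersection and inheritance information cannot be recovered from it afterwards.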
That makes sense. Maybe adding some kind of note to the schema could help with that? |
The problem with deprecating I think there could be enormous value in refactoring typescript-json-schema to be more modular, and paring down the list of options to reduce the complexity of the many code paths. There also could be merit to using an AST first approach - i.e. do what we can via traversing the AST, where alias refs will work really well, and only fall back to I would be quite interested in taking on this task (I already maintain typescript-json-validator which is a wrapper around typescript-json-schema), but didn't want to before because I don't want to either:
Having said that, I have also started work on typeconvert, which aims to do type inference on babel ASTs to convert between TypeScript and Flow (and generate documentation, JSON Schema etc.) Unfortunately it's nowhere near ready for release yet though as I keep realising I've made a mistake and need to fundamentally refactor. |
Thank you for your message. I agree that a hybrid approach might work well but it's hard to say until we have a working implementation. For me personally, my concern is that I can generate schemas for Vega-Lite. Maybe you can use that as a test case for another implementation of a hybrid schema generator? |
I think a hybrid approach could definitely be a good solution. I don't see how this would contribute to the 'yet another fork problem', since I think this can just be integrated into one of the existing ones (preferably this one of course)? I don't know much about the internals of either code base though so I might be completely wrong here.
Perhaps it's possible to reuse the logic TypeScript itself is using? Visual Studio Code has (in my experience) near perfect "reflection" on TypeScript code so it should be possible. I think VS Code uses something like this, maybe that's something worth looking into? Again I'm really not qualified to make any well founded argument about this but I try to help as much as I can. |
Maybe we can do something like this: Excuse me if this is way off base, I can't read the whole thread right this minute. |
According to this table mapping types, conditional types or even specific types like Exclude or Omit are not supported in io-ts. So for converting ast to io-ts we still have to do all the complex mapping/condition resolving as we do it now. So nothing gained here in my opinion. And what about annotations? Currently it is very easy to pass them from typescript to the JSON schema. With io-ts in between this will probably be more difficult. I think adding io-ts into the chain just adds more complexity and slows down the project. |
I was about to start trying out TypeScript to JSON Schema and then I saw this thread. I am now confused about which library I should use. Where are we at today and which one do you recommend? |
@codler I added a tldr to the first message. Does that help? |
A few reasons I'm thinking about: motivation, ability to chat with other developers, and avoiding duplicated effort. I talk to the maintainer and a few other team members pretty regularly on Discord, so it's easier to discuss design decisions, get feedback, we avoid duplicated effort, and it's more motivating to work on typedoc. It's fun to chat and share progress. Using the compiler API isn't free: it requires understanding the correct way to use it, understanding the quirks, understanding the gotchas to avoid. Typedoc already does that. It uses the typechecker as much as possible except for a few places where it can't. Typedoc handles exports as you would expect, as opposed to this library which exposes internal types, cannot handle multiple types with the same name, and does not handle alias exports. Typedoc needs to accurately document typescript modules, meaning that the extracted type information is more intuitive: it matches the code one-to-one. Typedoc has some extra features which might be useful in the future. It can extract type info to a JSON dump, then use that export to render docs multiple times. It might improve performance being able to extract types once and then render multiple schemas. |
That all sounds fantastic. Do you want to try making a POC? |
I want to ask about this specifically, since it may be a breaking change if we go with typedoc:
This library exposes non-exported types in the schema, using their internal, non-exported names. But I'm not sure why that is necessary, and I'm not sure it is compatible with typedoc's extracted reflections. Do you know if the requirement to expose internals is documented anywhere? Do you know if the identifiers assigned to I'd like to propose a simpler way to tell the schema extractor which schemas to extract: This pushes configuration into the TypeScript language. |
I think it can be nice to use internal aliases to name definitions in the schema if we need them (e.g. when we have a recursive data structure). I don't think we usually expose internal aliases otherwise, do we?
Where would they add a default export? In the source file or in schemas.ts? I do like that we would avoid the challenge of duplicate types and make it very explicit what types get exported. However, it's not how typedoc works and some users may get confused why they don't see some types. Maybe it's okay if people already have the relevant types exported in their My only use case for this library right now is https://github.com/vega/vega-lite and it would be a lot of work to export all the relevant types again. Do you have a suggestion for how I could reduce the necessary work? |
I'm thinking of this flag:
It's tough to tell how that flag is meant to behave, but does it fail if two non-exported types in different files have the same identifier? I'm fine with creating shared ref definitions, but the tool should ensure they do not have naming conflicts, and the names do not need to follow any particular rules. They can be made human-readable, but there do not need to be any guarantees about their name. If the user wants a type to have a guaranteed "definition" name, they can achieve that by exporting it from
In |
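One way a tool could guarantee that shared definitions never collide is a small name allocator like the one sketched here. The class `DefinitionNames` and its numeric-suffix scheme are assumptions made for illustration; the thread only asks that names be conflict-free, not that they follow this particular rule.

```typescript
// Hypothetical allocator that hands out definition names which never collide.
class DefinitionNames {
  // Maps each allocated name to the declaration that owns it ("file#identifier").
  private readonly taken: { [name: string]: string } = Object.create(null);

  // Return a stable, unique definition name for the declaration `identifier` in `file`.
  allocate(file: string, identifier: string): string {
    const owner = `${file}#${identifier}`;
    let candidate = identifier;
    let suffix = 1;
    // Try "Name", then "Name_2", "Name_3", ... until the name is free or already ours.
    while (this.taken[candidate] !== undefined && this.taken[candidate] !== owner) {
      suffix += 1;
      candidate = `${identifier}_${suffix}`;
    }
    this.taken[candidate] = owner;
    return candidate;
  }
}

// Example: two files declare a non-exported type called `Options`.
const names = new DefinitionNames();
const first = names.allocate("src/a.ts", "Options");
const second = names.allocate("src/b.ts", "Options");
const again = names.allocate("src/a.ts", "Options");
```

The first declaration keeps the readable name, later ones get a suffix, and asking again for the same declaration returns the name it already holds.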
I think you meant a normal export then, not a default export (as there can be only one). Right? |
The idea is that the root $ref is determined by the default export, if it exists. Any other named exports will be guaranteed to reside at their name in "definitions". For example:
In the emitted JSON schema: The root "$ref" is guaranteed to refer to
This means that you can trivially modify the root |
Oh, I missed the part about other definitions being included potentially. I see. So definitions that are exported will have a predictable name and any other aliases or types could have arbitrary names (e.g. to include the path). That's a good idea. I think it makes sense overall. |
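A rough sketch of the export convention summarized above: the default export selects the root `$ref`, and every named export sits under its own name in `definitions`. The function `buildRoot`, the `RootSchema` shape, the draft-07 `$schema` URI, and the simplification that the default export is also one of the named exports are all assumptions for illustration, not the proposal's actual design.

```typescript
interface RootSchema {
  $schema: string;
  $ref?: string;
  definitions: { [name: string]: unknown };
}

// `named` maps export names to their schemas; `defaultExport` names the root type, if any.
function buildRoot(named: { [name: string]: unknown }, defaultExport?: string): RootSchema {
  const root: RootSchema = {
    $schema: "http://json-schema.org/draft-07/schema#",
    definitions: {},
  };
  for (const name of Object.keys(named)) {
    root.definitions[name] = named[name]; // exported names are kept verbatim
  }
  if (defaultExport !== undefined) {
    if (!Object.prototype.hasOwnProperty.call(named, defaultExport)) {
      throw new Error(`Default export "${defaultExport}" has no schema`);
    }
    root.$ref = `#/definitions/${defaultExport}`;
  }
  return root;
}

// Example with hypothetical type names: `Spec` is the default export.
const rootExample = buildRoot(
  { Spec: { type: "object" }, Mark: { type: "string" } },
  "Spec",
);
```

With this layout a consumer can point the root at a different exported type by changing only the `$ref` string.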
Hi all, I have been using Zod for input validation in some of my projects. Since the discussion here seems to be about producing JSON Schema from TS definitions, I thought I'd share the ts-to-zod and zod-to-json-schema libraries. Combining both, I think we can do |
That's interesting. I wonder how flexible Zod is, though. Does any information get lost in the intermediate representation? |
My immediate thought: does it support customization of the schema with JSDoc
|
Running some tests currently. So far, it seems like it can't handle circular dependencies in type definitions. Getting the following warning while running
|
As per their docs, they support only 6 of the JSDoc keywords. From their docs:
// source.ts
export interface HeroContact {
/**
* The email of the hero.
*
* @format email
*/
email: string;
/**
* The name of the hero.
*
* @minLength 2
* @maxLength 50
*/
name: string;
/**
* The phone number of the hero.
*
* @pattern ^([+]?d{1,2}[-s]?|)d{3}[-s]?d{3}[-s]?d{4}$
*/
phoneNumber: string;
/**
* Does the hero has super power?
*
* @default true
*/
hasSuperPower?: boolean;
/**
* The age of the hero
*
* @minimum 0
* @maximum 500
*/
age: number;
}
// output.ts
export const heroContactSchema = z.object({
/**
* The email of the hero.
*
* @format email
*/
email: z.string().email(),
/**
* The name of the hero.
*
* @minLength 2
* @maxLength 50
*/
name: z.string().min(2).max(50),
/**
* The phone number of the hero.
*
* @pattern ^([+]?d{1,2}[-s]?|)d{3}[-s]?d{3}[-s]?d{4}$
*/
phoneNumber: z.string().regex(/^([+]?d{1,2}[-s]?|)d{3}[-s]?d{3}[-s]?d{4}$/),
/**
* Does the hero has super power?
*
* @default true
*/
hasSuperPower: z.boolean().default(true),
/**
* The age of the hero
*
* @minimum 0
* @maximum 500
*/
age: z.number().min(0).max(500),
}); |
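For comparison, a generator that reads such tags directly needs little more than a tag-to-keyword mapping. The sketch below handles only the tags that appear in the example above; the function `keywordsFromJsDoc` and the two tag lists are hypothetical and do not reflect how ts-to-zod or any generator in this thread is implemented.

```typescript
// Tags whose value is a number in JSON Schema.
const NUMERIC_TAGS = ["minLength", "maxLength", "minimum", "maximum"];
// Tags whose value is copied as a string.
const STRING_TAGS = ["format", "pattern"];

// Extract `@tag value` pairs from a JSDoc comment and turn them into schema keywords.
function keywordsFromJsDoc(comment: string): { [keyword: string]: unknown } {
  const keywords: { [keyword: string]: unknown } = {};
  // Matches lines such as " * @minLength 2" and captures the tag name and its value.
  const tagLine = /^\s*\*?\s*@(\w+)\s+(.+?)\s*$/;
  for (const line of comment.split("\n")) {
    const match = tagLine.exec(line);
    if (match === null) continue; // description text, blank line, or comment delimiter
    const tag = match[1];
    const raw = match[2];
    if (NUMERIC_TAGS.indexOf(tag) !== -1) {
      keywords[tag] = Number(raw);
    } else if (STRING_TAGS.indexOf(tag) !== -1) {
      keywords[tag] = raw;
    } else if (tag === "default") {
      keywords[tag] = JSON.parse(raw); // assumes the default is written as JSON, e.g. `true`
    }
    // Unknown tags are ignored in this sketch.
  }
  return keywords;
}

// Example: the comment on `name` from the interface above.
const nameComment = [
  "/**",
  " * The name of the hero.",
  " *",
  " * @minLength 2",
  " * @maxLength 50",
  " */",
].join("\n");
const nameKeywords = keywordsFromJsDoc(nameComment);
```

The resulting keywords can be merged into the property's schema next to its `type`, so no information has to pass through an intermediate representation.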
Hi, first of all thanks for the good work to all the authors of these libraries. Secondly, if there is going to be a new version with breaking changes, the JSDoc annotations could/should be ditched in favour of decorators. |
Can you explain why? |
@domoritz Also, most of the libraries related to data mapping, ORMs, etc. are using decorators now. |
Decorators add runtime metadata to TypeScript and JavaScript classes, but not to other types. This is not a proper analogue for what we want to accomplish here: adding design-time metadata to TypeScript types, including non-classes.
This is not true if the metadata is descriptions and deprecation status. That information is best put in JSDoc; moving it elsewhere removes some of its utility.
This is, I believe, an impractical rule to follow. "Mostly" acknowledges that it is used for more than documentation. Additionally, JSON schemas extracted by this tool serve as documentation. For example, they can be used to generate OpenAPI specifications. In this use-case, the "metadata" is documentation and should be kept in JSDoc where it simultaneously serves other purposes. I can imagine scenarios where decorator metadata and JSDoc metadata are used in tandem, but outlawing the use of JSDoc annotations would be counter-productive. Descriptions should reside in JSDoc so they appear in tooltips and for compatibility with tools such as typedoc. Types must still be annotated using TypeScript syntax. That's a good reason for a schema generator to parse JSDoc comments and type information, even if it combines that information with decorator metadata. If it is valuable for the schema generator to maintain its ability to parse JSDoc and types, then we know that retrieving other JSDoc tags is straightforward. Removing the ability to do that wouldn't simplify the code enough to justify the loss of utility. Note that emitDecoratorMetadata does not emit sufficient information, in case anyone wanted to bring that up. |
I've read this thread and saw that there is still no sufficient solution available to read computed type information. Just wanted to let you know that we released a TypeScript runtime type system that provides, via reflection, all the information you are looking for to generate a JSON schema. More information can be found here: https://deepkit.io/blog/introducing-deepkit-framework. The concrete feature to support this use case is here: https://deepkit.io/documentation/type/reflection |
I think I'd like to see TypeScript - or at least a subset of TypeScript - become a first-class format for the purposes of JSON validation and e.g. comparison against OpenAPI specs. It is the most succinct format for expressing JavaScript types that I have come across. The one thing I do like about generating an intermediate schema file, though, is that it produces a single artifact, so only that file needs to be included in a compiled project. Update: Just learned that the latest OpenAPI Spec is based on JSON Schema, huh. |
Support Example: https://github.com/mongodb/js-bson/blob/main/src/objectid.ts#L202 EDIT: example, docs |
It's now possible to use only TypeScript types to describe and generate OpenAPI documents and thus JSON Schema, without code generation whatsoever: https://github.com/hanayashiki/deepkit-openapi Code like that:
import { MinLength } from '@deepkit/type';
interface User {
id: number;
name: string & MinLength<3>;
password: string;
}
type ReadUser = Omit<User, 'password'>;
type CreateUser = Omit<User, 'id'>;
translates to schemas:
ReadUser:
type: object
properties:
id:
type: number
name:
type: string
minLength: 3
required:
- id
- name
CreateUser:
type: object
properties:
name:
type: string
minLength: 3
password:
type: string
required:
- name
- password
User:
type: object
properties:
id:
type: number
name:
type: string
minLength: 3
password:
type: string
required:
- id
- name
- password |
@marcj does that require using the DeepKit framework though? |
@tommedema no |
From the linked project's own README
|
Deepkit Framework is the full thing, like Laravel. But Deepkit is also just a collection of standalone packages you can use separately/standalone. |
This is a discussion about the future of Typescript JSON schema generators.
TL;DR