Skip to content

support recursive & mutually recursive types#54

Open
SebastienGllmt wants to merge 5 commits intoxddq:mainfrom
SebastienGllmt:mutually-recursive
Open

support recursive & mutually recursive types#54
SebastienGllmt wants to merge 5 commits intoxddq:mainfrom
SebastienGllmt:mutually-recursive

Conversation

@SebastienGllmt
Copy link
Copy Markdown
Contributor

@SebastienGllmt SebastienGllmt commented Apr 6, 2025

Previously, if you have any schema that was recursively defined (either simple recursion or mutually recursive), then the code would get stuck in an infinite loop until a Maximum call stack size exceeded error got thrown

Summary

To fix this, I used the (relatively new) Modules feature of Typebox. The benefits of using this feature are:

  1. The code is a lot cleaner than the Type.Recursive typebox feature (hence why things like Type.Recursive are somewhat deprecated and meant to move to the module system). See test/fixture/recursive.ts
  2. It also supports mutually recursive types instead of simple recursion
  3. not a breaking change. The end-result API of people who consume schema2typebox code is the same (same exports)
  4. Sets the ground work for a separate feature allowing to generate typebox for multiple json schemas at once without duplicating code (i.e. they could all just be put into one Typebox module)
  5. It makes it much easier to access inner schemas if you need to refer to their type in typescript land (see test/fixture/multi.ts

@xddq
Copy link
Copy Markdown
Owner

xddq commented Apr 7, 2025

Hi @SebastienGllmt, thanks for another PR : ]

Could you please provide me with a sample schema and output that fails in the current setup? If possible, mention your use case/scenario when you stumbled over this for me to better understand this.

Initial thoughts after skimming through the PR:

  1. looks like we should specify typebox as peer dependency instead for all following releases (since only higher versions have that feature available or rather, can make use of it). Which version is the minimal version we can specify for the module syntax? Which would you want to set as minimal peer dependency version and why?

  2. not a breaking change. The end-result API of people who consume schema2typebox code is the same (same exports)

I would argue that it is a "breaking change" since people with older typebox versions will simply not be able to use the generated code anymore. Good chance I might be wrong here, feel free to correct me. However, bumping the "feature" should be fine since we did not have any peer dependency set before anyway..? Thoughts?

  1. Other than that, looking good initially! Please note that I will probably need more time to review this properly and will probably only do this within the next weeks, not days. Feel free to ping me if I forgot this or did not get the time to look at it after appropriate time.

Edit: please also take a look at ci issues (I think first will get solved with correct peer dependency and dev dependency version, second is just formatting missing)

@SebastienGllmt
Copy link
Copy Markdown
Contributor Author

SebastienGllmt commented Apr 7, 2025

Could you please provide me with a sample schema and output that fails in the current setup

Yes, I added one as a test (recursive.json) which you can see here:
https://github.com/xddq/schema2typebox/pull/54/files#diff-8180a2313b13c125e09565d35c5e1278867661c88640740c5528107554e85526

Which version is the minimal version we can specify for the module syntax?

You need ^0.34.0 (the latest version of Typebox since this feature was only added last year)

we should specify typebox as peer dependency

makes sense. I've been using schema2typebox via npx, but I agree it makes sense to make the dependency setup work in cases where people want to add schema2typebox in their package.json. I'll look into the CI error in a bit

@xddq
Copy link
Copy Markdown
Owner

xddq commented Apr 19, 2025

So this PR will make every exported typebox code (regardless if it contains recursive refs or not) into Module types, correct? What are the drawbacks of this? Will the performance using the generated code for validating be similar/same? Does this somehow use way more memory?

Checking the docu it just mentions Module types as "Module types are containers for a set of referential types" src which to me sounds like this should be used for schemas containing references, but perhaos not for every typebox schema we generate..? Relying heavily on this could end up badly if there will be yet another replacement for Module syntax in typebox later on.

If I understand this correctly, one drawback/change would be that if user try to generate the json schema for the generated typebox code they suddenly end up with a different schema containing #def blocks and #ref blocks instead of "just the schema". I don't know how people use this library, but that could be annoying.

A few options I can think of:
a) implement/fix recursion without module syntax
b) only use modules if required
c) keep as is, release as 2.0 with recursion support and only generating module syntax schemas
d) don't add recursion support
e) ..?

Honestly I am leaning towards a) or d). Given we really have no major drawbacks when using the module syntax everywhere I would be somewhat down to go with c) as well and merge this into 2.0 after reviewing.

What are you thoughts on this?


just a sample schema for me to better understand this when reading again, dummy schema

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "CommentWithReplies",
  "title": "Comment",
  "type": "object",
  "properties": {
    "author": {
      "type": "string"
    },
    "message": {
      "type": "string"
    },
    "replies": {
      "type": "array",
      "items": {
        "$ref": "#"
      }
    }
  },
  "required": ["author", "message"]
}

@SebastienGllmt
Copy link
Copy Markdown
Contributor Author

SebastienGllmt commented Apr 19, 2025

So this PR will make every exported typebox code (regardless if it contains recursive refs or not) into Module types, correct?

Internally they are constructed as modules, but the exported type is the same as before (Module.Import extracts the type from the module)

Checking the docs it just mentions Module types as "Module types are containers for a set of referential types" src which to me sounds like this should be used for schemas containing references, but perhaps not for every typebox schema we generate..?

It's useful even for cases with no references to be able to aggregate many json schema files into one module (see #55)

Relying heavily on this could end up badly if there will be yet another replacement for Module syntax in typebox later on.

The code difference in this PR is not that big, so I don't think switching to an alternative would be a lot of work either

If I understand this correctly, one drawback/change would be that if user try to generate the json schema for the generated typebox code they suddenly end up with a different schema containing #def blocks and #ref blocks instead of "just the schema". I don't know how people use this library, but that could be annoying.

I tried to keep it the same by exporting Module.Import('...') (so end-users of the library don't have to know we used modules under-the-hood)

However, it turns out (see later in this post) that this still has some performance impact, so we can unwrap it further with Module.Import("<name>").$defs.<name>

What are the drawbacks of this? Will the performance using the generated code for validating be similar/same? Does this somehow use way more memory?

Here are the result of benchmarks I did

Validator

  • Value.Parse with Type.Number(): 1x (baseline)
  • Value.Parse with Type.Module({ bar: Type.Number() }): 2.2x
  • Value.Parse with Type.Module({ bar: Type.Number() }).$defs.bar: 1x

Initialization

Assuming you have the following type

Type.Object({
    author: Type.String(),
    message: Type.String(),
    replies: Type.Array(Type.Object({}))
  })

Performance

  • Using it as-is: 1x (baseline)
  • Type.Module({ bar: ... }): 17x

Memory

  • Using it as-is: 1x (baseline)
  • Type.Module({ bar: ... }): 2x (considering approximate object size)
  • Type.Module({ bar: ... }): 1.7x (considering total runtime memory taken by program)

Conclusion

From the runtime performance, you can see that there is basically no change if we unwrap the import ourselves (so maybe it's better to add unwraping with $defs.<name> in this PR). The only thing that changes in the initialization time (you only have to run a single time when you load the program).

You could argue that maybe we could generate different code to optimize the initialization time when there are no recursive references, but keep in mind

  • even with the Type.Module syntax, you can still initialize 500,000 object types in one second, so the impact on app startup will be trivial until they have thousands of types
  • when initializing 10 million entries in a loop, it takes 20kbs of memory using using Module.Import vs 12kbs of memory for the regular option, so the memory difference is be unnoticeable even with millions of types - although I assume the garbage collector mangles these benchmarks

Copy link
Copy Markdown
Owner

@xddq xddq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good overall ✨ Main points:

  • collect() feels a little too complicated
  • examples have to be regenerated
  • please only add recursive in this PR

Comment thread .prettierignore
bundle
dist
node_modules
test/fixture
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this added? Would prefer to have them formatted as well

Comment thread changelog.md
# 1.8.0

- migrate to the Typebox [modules](https://github.com/sinclairzx81/typebox?tab=readme-ov-file#types-modules) to support recursive types [src](https://github.com/xddq/schema2typebox/pull/51)
- export any inner schema with an `$id` instead of only the top-level type [src](https://github.com/xddq/schema2typebox/pull/51)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please drop the code specific to this feature in this PR. I want to focus on one feature at a time and the module one already takes quite some effort to review. In addition, I am not sure what the use case for this is/when this would be needed and if this should even be added. Open to discuss this later but for now, this should become a separate PR.

Suggested change
- export any inner schema with an `$id` instead of only the top-level type [src](https://github.com/xddq/schema2typebox/pull/51)

The code currently generates this

/**
 * ATTENTION. This code was AUTO GENERATED by schema2typebox.
 * While I don't know your use case, there is a high chance that direct changes
 * to this file get lost. Consider making changes to the underlying JSON schema
 * you use to generate this file instead. The default file is called
 * "schema.json", perhaps have a look there! :]
 */

import { Static, Type } from "@sinclair/typebox";

export const Module = Type.Module({
  Comment: Type.Object(
    {
      author: Type.String(),
      message: Type.String(),
      person: Type.Optional(Type.Ref("Person")),
      replies: Type.Optional(Type.Array(Type.Ref("Comment"))),
    },
    {
      $schema: "https://json-schema.org/draft/2020-12/schema",
      $id: "CommentWithReplies",
    }
  ),
  Person: Type.Object(
    {
      name: Type.String({ maxLength: 100 }),
      age: Type.Number({ minimum: 18 }),
    },
    { $id: "Person" }
  ),
});
export const Comment = Module.Import("Comment");
export type Comment = Static<typeof Comment>;
export const Person = Module.Import("Person");
export type Person = Static<typeof Person>;

based on these schemas

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "CommentWithReplies",
  "title": "Comment",
  "type": "object",
  "properties": {
    "author": {
      "type": "string"
    },
    "message": {
      "type": "string"
    },
    "person": {
      "$ref": "./person.json"
    },
    "replies": {
      "type": "array",
      "items": {
        "$ref": "#"
      }
    }
  },
  "required": [
    "author",
    "message"
  ]
}

{
  "$id": "Person",
  "title": "Person",
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "maxLength": 100
    },
    "age": {
      "type": "number",
      "minimum": 18
    }
  },
  "required": ["name", "age"]
}

But the generated output should be

/**
 * ATTENTION. This code was AUTO GENERATED by schema2typebox.
 * While I don't know your use case, there is a high chance that direct changes
 * to this file get lost. Consider making changes to the underlying JSON schema
 * you use to generate this file instead. The default file is called
 * "schema.json", perhaps have a look there! :]
 */

import { Static, Type } from "@sinclair/typebox";

export const Module = Type.Module({
  Comment: Type.Object(
    {
      author: Type.String(),
      message: Type.String(),
      person: Type.Optional(Type.Ref("Person")),
      replies: Type.Optional(Type.Array(Type.Ref("Comment"))),
    },
    {
      $schema: "https://json-schema.org/draft/2020-12/schema",
      $id: "CommentWithReplies",
    }
  ),
  Person: Type.Object(
    {
      name: Type.String({ maxLength: 100 }),
      age: Type.Number({ minimum: 18 }),
    },
    { $id: "Person" }
  ),
});
export const Comment = Module.Import("Comment");
export type Comment = Static<typeof Comment>;

Otherwise it will be highly likely that one would have duplicated exports. E.g. "Person" 10 times because it was referenced in 10 different schemas that were the basis for code generation.

Comment thread package.json
"@jest/globals": "29.7.0",
"@sinclair/typebox": "0.30.2",
"@sinclair/typebox": "0.34.33",
"@tsconfig/node18": "2.0.0",
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to set a peer dependency to @sinclair/typebox with version ^0.34 here, no? Otherwise people could use this programmatically with typebox 0.33 (or anything else) without a warning and without working code. The change should be somewhat similar to this.

Comment thread src/schema-to-typebox.ts
.join(",\n");
const typeAliases = Array.from(entries.keys())
.map((key) => {
return `export const ${key} = Module.Import('${key}');
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as mentioned above, please only export the first/top level schema (e.g. only Comment based on the example)

Comment thread src/schema-to-typebox.ts
schema: JSONSchema7Definition,
entries: ModuleEntries
): Code => {
if (typeof schema === "object" && schema.$id !== undefined) {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to collect() feel a little complicated/confusing. Can you come up with something simpler/easier to understand/maintain? The other changes look fine.

E.g. The part here in l.86 looks pretty similar to l.129. Also, the typeof schema === "object" could perhaps become part of parseObject()? Do we really need the iife here?

Comment thread test/fixture/recursive.ts
@@ -0,0 +1,33 @@
/**
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

Comment thread test/schema.spec.ts
}
});

test("works for nested", async () => {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above, please remove from this pr

Comment thread tsconfig.json
},
"include": ["src", "test"],
"exclude": ["dist", "bundle", "node_modules"]
"exclude": ["dist", "bundle", "test/fixture", "node_modules"]
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason for this? would prefer to not exclude this since it won't get type checked anymore

Comment thread test/fixture/dayOfWeek.ts
@@ -37,39 +37,42 @@ const OneOf = <T extends TSchema[]>(
oneOf,
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not in this file, but there were no changes to examples)

We need to recreate/update all examples as well since they will now use Module.

Comment thread src/schema-to-typebox.ts
export const collect = (
schema: JSONSchema7Definition,
entries: ModuleEntries
): Code => {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QoL: Just a thought. It feels a little off that collect returns Code but in the end we don't use the generated code. We simply use the entries.

Maybe (might be bad idea, you will know better) we can keep it consistent and add some kind of placeholder whenever we would use the references to recursive types and then postprocess the Code from collect by replacing these placeholders based on the entries map..?

@xddq
Copy link
Copy Markdown
Owner

xddq commented Apr 21, 2025

Thanks for the summary and the initial performance check. I think we should add this as you suggested to this PR

(so maybe it's better to add unwraping with $defs. in this PR)

Regarding the rest, seems reasonable to use Module everywhere. If people ever have problems, this could be changed later anyway.

@bastiion
Copy link
Copy Markdown

First of all thank you @xddq for the base work and @SebastienGllmt for the PR.
Very sad, that this PR never got merged - it is very useful and is holding me back from using schema2typebox in a production project. Recursion is something that I find over and over in the JSONSchema we use - and thus I would be happy to see any progress here. Is there any help needed?

@xddq
Copy link
Copy Markdown
Owner

xddq commented Sep 28, 2025

hey @bastiion

Is there any help needed?

You have the open PR and my comments. If you want to invest the time to get this shipped, feel free to open up another PR with the comments adressed. I won't put time into shipping this myself, however I am down to review in a timely manner. If you come up with a PR in the next (1-10) days, you can expect a review within a couple (1-5) of days.

It would be great if you could find an easier way to supporting recursive schemas (again, I might be totally wrong here) but it felt a little too complicated. If you put time into this and say this is the way to go, I am also fine. Would just be glad if this gets a closer look.

@SebastienGllmt
Copy link
Copy Markdown
Contributor Author

@bastiion I won't have time to work on this, so you're free to yoink my code and modify it as required to get it merged

@SebastienGllmt
Copy link
Copy Markdown
Contributor Author

SebastienGllmt commented Oct 17, 2025

FYI it looks like Type.Recursive will be removed in the upcoming version of Typebox (1.0.0), and replaced with Type.Cyclic (which works like modules under the hood: https://github.com/sinclairzx81/typebox/blob/main/changelog/1.0.0-migration.md#typerecursive)

@bastiion
Copy link
Copy Markdown

okay, then it was good to wait. At the moment I am working more with zod - but this thing gets interesting when working with types in Elysiajs Typebox I hope I will have time to look at it when 1.0.0 gets released and maybe this even opens up things like zod <--> typebox interoperability

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants