feat(mdxish): add new MDXish engine #1243
base: next
- created more tests
feat: add tests, stubs and exports
feat: first pass at migrating over mdxish code
fix: manually parse variable nodes
fix: fix embed blocks
We have created a documentation file that explains the flow of the new engine and most of its moving parts. See it here directly ->
export { default as mdastV6 } from './mdastV6';
export { default as mdx } from './mdx';
export { default as mix } from './mix';
export { default as mdxish } from './mdxish';
This and the export on L15 are the main entry points for the new engine. As mentioned above, these mimic the existing run and compile of the existing mdx function.
- `mix` returns stringified HTML
- `mdxish` returns HAST
- `renderMdxish` takes in HAST and spits out React components
const processedContent = preprocessJSXExpressions(mdContent, jsxContext);
const processor = unified()
This is the main processor for mdxish. It's a recursive process, so that child nodes are processed as well. For more details on the plugins:
| Phase | Plugin | Purpose |
|---|---|---|
| Pre-process | `preprocessJSXExpressions` | Evaluate `{expressions}` before parsing |
| MDAST | `remarkParse` | Markdown → AST |
| MDAST | `remarkFrontmatter` | Parse YAML frontmatter (metadata) |
| MDAST | `defaultTransformers` | Transform callouts, code tabs, images, gemojis |
| MDAST | `mdxishComponentBlocks` | PascalCase HTML → `mdxJsxFlowElement` |
| MDAST | `embedTransformer` | `[label](url "@embed")` → `embedBlock` nodes |
| MDAST | `variablesTextTransformer` | `{user.*}` → `<Variable>` nodes (regex-based) |
| MDAST | `tailwindTransformer` | Process Tailwind classes (conditional, if `useTailwind`) |
| MDAST | `remarkGfm` | GitHub Flavored Markdown: tables, strikethrough, task lists, autolinks, footnotes |
| Convert | `remarkRehype` + handlers | MDAST → HAST |
| HAST | `rehypeRaw` | Raw HTML strings → HAST elements |
| HAST | `rehypeSlug` | Add IDs to headings |
| HAST | `rehypeMdxishComponents` | Match & transform custom components |
A small note on the remarkRehype plugin. Since we are not using remarkMdx, remarkRehype doesn't know how to handle MDX nodes by default. We pass in our custom handler (mdxComponentHandlers) to convert the mdxJsxFlowElement nodes to HAST elements.
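To make the handler idea concrete, here is a minimal standalone sketch of the node mapping such a handler might perform. It is not the PR's `mdxComponentHandlers`: the local types, the function names `mdxJsxFlowElementToHast` and `toHast`, and the `div` fallback for unnamed elements are all assumptions for illustration. The real handler would be registered under the node type in remark-rehype's `handlers` option, with whatever signature the installed version expects.

```typescript
// Minimal local stand-ins for the mdast/hast node types (assumed shapes).
interface MdxJsxAttribute { type: 'mdxJsxAttribute'; name: string; value?: string | null }
interface MdastText { type: 'text'; value: string }
interface MdxJsxFlowElement {
  type: 'mdxJsxFlowElement';
  name: string | null;
  attributes: MdxJsxAttribute[];
  children: MdastNode[];
}
type MdastNode = MdastText | MdxJsxFlowElement;

interface HastText { type: 'text'; value: string }
interface HastElement {
  type: 'element';
  tagName: string;
  properties: Record<string, string | boolean>;
  children: HastNode[];
}
type HastNode = HastText | HastElement;

// Hypothetical mapping: one MDX JSX flow element becomes one HAST element.
function mdxJsxFlowElementToHast(node: MdxJsxFlowElement): HastElement {
  const properties: Record<string, string | boolean> = {};
  for (const attr of node.attributes) {
    // A bare attribute such as `<Foo open />` carries no value; treat it as `true`.
    properties[attr.name] = attr.value == null ? true : attr.value;
  }
  return {
    type: 'element',
    tagName: node.name ?? 'div', // assumption: unnamed elements fall back to a div
    properties,
    children: node.children.map(toHast),
  };
}

// Converts children recursively so nested components are mapped too.
function toHast(node: MdastNode): HastNode {
  if (node.type === 'text') return { type: 'text', value: node.value };
  return mdxJsxFlowElementToHast(node);
}
```

The point of the sketch is only that, without remarkMdx, something has to tell the MDAST → HAST step what an `mdxJsxFlowElement` turns into.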
.use(calloutTransformer)
.use(mdxishComponentBlocks)
.use(embedTransformer)
.use(variablesTextTransformer) // we cant rely in remarkMdx to parse the variable, so we have to parse it manually
A small note on variablesTextTransformer: we can't use the existing variables transformer because it expects MDX as its input and requires the remarkMdx plugin. Since we cannot use MDX, we had to create a new transformer that is text-based instead of MDX-based.
| | variables.ts | variables-text.ts |
|---|---|---|
| Parser | Relies on remarkMdx | Uses regex |
| Input nodes | mdxFlowExpression, mdxTextExpression | text |
| Pipeline | Full MDX (run.tsx) | mdxish (lightweight) |
| Dependency | Needs remarkMdx + ESTree parsing | No MDX dependency |
Both produce the same output: Variable nodes with hName: 'Variable' and hProperties: { name: fieldName }.
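As a rough illustration of the text-based approach, the sketch below splits a plain string on `{user.field}` occurrences and emits nodes carrying the `hName` and `hProperties` described above. The node type name `readme-variable`, the function name `splitTextIntoVariables`, and the exact regex are assumptions; the real `variables-text.ts` may differ.

```typescript
interface TextNode { type: 'text'; value: string }
interface VariableNode {
  type: 'readme-variable'; // hypothetical node type name
  data: { hName: 'Variable'; hProperties: { name: string } };
}
type InlineNode = TextNode | VariableNode;

// Splits one text value into text and Variable nodes, in document order.
function splitTextIntoVariables(value: string): InlineNode[] {
  // Assumed syntax: `{user.fieldName}` with a word-character field name.
  // The regex is created per call so no `lastIndex` state leaks between calls.
  const pattern = /\{user\.(\w+)\}/g;
  const nodes: InlineNode[] = [];
  let lastIndex = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(value)) !== null) {
    if (match.index > lastIndex) {
      nodes.push({ type: 'text', value: value.slice(lastIndex, match.index) });
    }
    nodes.push({
      type: 'readme-variable',
      data: { hName: 'Variable', hProperties: { name: match[1] ?? '' } },
    });
    lastIndex = match.index + match[0].length;
  }
  if (lastIndex < value.length) {
    nodes.push({ type: 'text', value: value.slice(lastIndex) });
  }
  return nodes;
}
```

A transformer built on this would visit each `text` node in the MDAST and replace it with the returned nodes, which is why no MDX expression parsing is needed.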
.use(remarkRehype, { allowDangerousHtml: true, handlers: mdxComponentHandlers })
.use(rehypeRaw)
.use(rehypeSlug)
.use(rehypeMdxishComponents, {
rehypeMdxishComponents is also a custom plugin we created to handle the final part of the pipeline. TL;DR: it mimics what MDX does overall and does the following:
- Component matching
- Prop normalization (e.g., `class` to `className`)
- Processing of child nodes
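The three responsibilities above can be pictured with a small standalone walker over HAST-like nodes. This is a sketch under assumptions, not the plugin itself: the types, the function name `transformComponents`, and the case-insensitive matching (on the guess that an HTML parsing step may have lowercased tag names) are all hypothetical.

```typescript
interface HText { type: 'text'; value: string }
interface HElement {
  type: 'element';
  tagName: string;
  properties: Record<string, unknown>;
  children: HNode[];
}
type HNode = HText | HElement;

// `componentNames` is a hypothetical list of registered component names.
function transformComponents(node: HNode, componentNames: string[]): HNode {
  if (node.type === 'text') return node;
  const element: HElement = node;

  // Component matching: compare case-insensitively against the registry.
  const lowered = element.tagName.toLowerCase();
  let matched: string | undefined;
  for (const name of componentNames) {
    if (name.toLowerCase() === lowered) {
      matched = name;
      break;
    }
  }

  // Prop normalization: rename `class` to `className`, keep everything else.
  const properties: Record<string, unknown> = {};
  for (const key of Object.keys(element.properties)) {
    properties[key === 'class' ? 'className' : key] = element.properties[key];
  }

  return {
    type: 'element',
    tagName: matched ?? element.tagName,
    properties,
    // Child processing: recurse so nested components are handled too.
    children: element.children.map(child => transformComponents(child, componentNames)),
  };
}
```

Returning a new tree rather than mutating in place keeps the sketch easy to test; a real rehype plugin would more likely mutate nodes during a tree visit.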
const tocHast = headings.length > 0 ? tocToHast(headings, MAX_DEPTH) : null;
return buildRMDXModule(content, headings, tocHast, contextOpts);
renderMdxish converts a HAST tree (from mdxish) into a React component module. The pipeline:
1. `loadComponents()` + merge user components → merged component map
2. `extractToc(tree, components)` → headings array
3. `exportComponentsForRehype(components)` → flattened component map for rehype-react
4. `createRehypeReactProcessor(componentsForRehype)` → unified processor
5. `processor.stringify(tree)` → React.ReactNode content
6. `tocToHast(headings, MAX_DEPTH)` → TOC HAST structure
7. `buildRMDXModule(content, headings, tocHast, contextOpts)` → final RMDXModule
Output: RMDXModule with default (main component), toc (heading data), and Toc (TOC component).
For the complete call tree, refer to this graph:
HAST Tree (input)
    │
    ├─→ extractToc() → headings[]
    │       │
    │       └─→ tocToHast() → tocHast
    │
    └─→ exportComponentsForRehype()
            │
            └─→ createRehypeReactProcessor()
                    │
                    └─→ processor.stringify() → React.ReactNode (content)
                                                         │
                                                         │
buildRMDXModule(content, headings, tocHast, opts) ←──────┘
    │
    └─→ RMDXModule {
            default: DefaultComponent,
            toc: headings,
            Toc: TocComponent
        }
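The first branch of the call tree, heading extraction, can be sketched as a small recursive walk over the HAST tree. This is only an illustration: the `Heading` shape, the function names, and the decision to ignore the component map that the real `extractToc(tree, components)` receives are all assumptions.

```typescript
interface HText { type: 'text'; value: string }
interface HElement {
  type: 'element';
  tagName: string;
  properties: Record<string, unknown>;
  children: HNode[];
}
interface HRoot { type: 'root'; children: HNode[] }
type HNode = HText | HElement;

// Hypothetical heading record; the real headings array may carry more fields.
interface Heading { depth: number; id: string; text: string }

// Concatenates all descendant text of a node.
function textOf(node: HNode): string {
  return node.type === 'text' ? node.value : node.children.map(textOf).join('');
}

// Collects h1..h6 elements in document order, descending into other elements.
function collectHeadings(parent: HRoot | HElement, out: Heading[] = []): Heading[] {
  for (const child of parent.children) {
    if (child.type !== 'element') continue;
    const match = /^h([1-6])$/.exec(child.tagName);
    if (match) {
      out.push({
        depth: Number(match[1]),
        id: String(child.properties['id'] ?? ''), // ids are assumed to come from a slug step
        text: textOf(child),
      });
    } else {
      collectHeadings(child, out);
    }
  }
  return out;
}
```

In the pipeline above, an array like this is what would be handed to `tocToHast` and later exposed as `toc` on the module.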
This custom plugin is needed because:
- Remark parses unknown tags as raw HTML; we rewrite them so downstream MDX/rehype tooling treats them as components (supports self-closing and wrapped content)
- If there are empty lines inside the components, the remark parsers might incorrectly construct the tree; e.g. content beneath the component might get bundled up. So this plugin cleans the content up to prevent such cases
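The first reason can be illustrated with a standalone function that rewrites a raw `html` node into an `mdxJsxFlowElement` when its tag name is PascalCase. This is a simplified sketch: the function name `htmlToComponentBlock`, the regexes, and the restriction to double-quoted string attributes are assumptions, and it does not attempt the empty-line cleanup described in the second bullet.

```typescript
interface HtmlNode { type: 'html'; value: string }
interface JsxAttribute { type: 'mdxJsxAttribute'; name: string; value: string | null }
interface JsxFlowElement {
  type: 'mdxJsxFlowElement';
  name: string;
  attributes: JsxAttribute[];
  children: { type: 'text'; value: string }[];
}

// Matches `<Name attrs />` or `<Name attrs>content</Name>`; only PascalCase names qualify.
const COMPONENT_BLOCK = /^<([A-Z][A-Za-z0-9]*)((?:\s+[^<>]*?)?)\s*(?:\/>|>([\s\S]*)<\/\1>)\s*$/;

function htmlToComponentBlock(node: HtmlNode): HtmlNode | JsxFlowElement {
  const match = COMPONENT_BLOCK.exec(node.value.trim());
  if (!match) return node; // lowercase or malformed tags stay as raw HTML

  const name = match[1] ?? '';
  const rawAttributes = match[2] ?? '';
  const content = match[3]; // undefined for the self-closing form

  // Assumed attribute syntax: `key="value"` or a bare `key`.
  const attributes: JsxAttribute[] = [];
  const attributePattern = /([A-Za-z_][\w-]*)(?:="([^"]*)")?/g;
  let attr: RegExpExecArray | null;
  while ((attr = attributePattern.exec(rawAttributes)) !== null) {
    attributes.push({
      type: 'mdxJsxAttribute',
      name: attr[1] ?? '',
      value: attr[2] === undefined ? null : attr[2],
    });
  }

  return {
    type: 'mdxJsxFlowElement',
    name,
    attributes,
    // Wrapped content is kept as plain text here; a real plugin would parse it as markdown.
    children: content ? [{ type: 'text', value: content.trim() }] : [],
  };
}
```

Leaving lowercase tags untouched means ordinary HTML still flows through the raw-HTML path, while anything that looks like a component is handed to the component handlers.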
fix: fixed some skipped tests and add `remarkGfm` and `remarkFrontmatter`
🧰 Changes
Note
A few notes before everything else: this PR adds the logic for a new engine and does only that (rendering). Hooking the new engine up for validation and for editing markdown will come in separate PRs.
Context
This PR exports 2 new libraries which provide a new way to render mixed Markdown + MDX content in our application.
This allows customers to flexibly embed MDX inside Markdown without relying on the strict MDX renderer or needing to migrate everything to MDX (which currently causes many errors and requires hours of cleanup).
Important
With the addition of the new libraries, we have unfortunately exceeded the maximum allowed bundle size. Specifically, the current bundle size is 762KB and the limit was 750KB; the limit has been increased to 775KB.
Changes
- `mdxish.ts`
- `renderMdxish.tsx`: `run.tsx` behaviour used in production; returns an RMDXModule, which contains the content React component and the table of contents
🧬 QA & Testing
How to Test
To test this new rendering engine directly in the ReadMe app:
1. `npm ci && npm run build`
2. `make link-markdown`
3. In `config/development.js`, set the `mdx.server.enabled` variable to false to disable MDX validation and allow content to render in view mode
4. Use `tests/lib/mdxish/demo-docs` as examples in your editor
Things to Test in Docs
<br>)
📸 Some Screenshots
These screenshots show sample MD/MDX pages rendered using the new libraries. None of the screenshots or demos here have correct validation yet; we purposefully disabled validation to demo this new engine/library.
Missing components do not immediately error the entire page