Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node tableHead not parsed correctly #14

Closed
4 tasks done
diervo opened this issue Aug 7, 2024 · 6 comments
Closed
4 tasks done

Node tableHead not parsed correctly #14

diervo opened this issue Aug 7, 2024 · 6 comments
Labels
🤷 no/invalid This cannot be acted upon 👎 phase/no Post cannot or will not be acted on

Comments

@diervo
Copy link

diervo commented Aug 7, 2024

Initial checklist

Affected packages and versions

[email protected]

Link to runnable example

No response

Steps to reproduce

The following code should produce a tableHead node:

fromMarkdown(text, {
        extensions: [gfm()],
        mdastExtensions: [gfmFromMarkdown()],
    });
| Column 1 | Column 2 | Column 3 |
| -------- | -------- | -------- |
| foo      | bar      | blob     |
| baz      | qux      | trust    |
| quux     | quuz     | glob     |

Expected behavior

fromMarkdown(text, {
        extensions: [gfm()],
        mdastExtensions: [gfmFromMarkdown()],
    });
    

### Actual behavior

TableRow is returned instead.
```json
{
    "type": "table",
    "align": [
        null,
        null,
        null
    ],
    "children": [
        {
            "type": "tableRow",
            "children": [
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "Column 1"
                        }
                    ]
                },
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "Column 2"
                        }
                    ]
                },
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "Column 3"
                        }
                    ]
                }
            ]
        },
        {
            "type": "tableRow",
            "children": [
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "foo"
                        }
                    ]
                },
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "bar"
                        }
                    ]
                },
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "blob"
                        }
                    ]
                }
            ]
        },
        {
            "type": "tableRow",
            "children": [
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "baz"
                        }
                    ]
                },
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "qux"
                        }
                    ]
                },
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "trust"
                        }
                    ]
                }
            ]
        },
        {
            "type": "tableRow",
            "children": [
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "quux"
                        }
                    ]
                },
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "quuz"
                        }
                    ]
                },
                {
                    "type": "tableCell",
                    "children": [
                        {
                            "type": "text",
                            "value": "glob"
                        }
                    ]
                }
            ]
        }
    ]
}

Runtime

Node v16

Package manager

npm v7

OS

macOS

Build and bundle tools

Vite

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Aug 7, 2024
@diervo diervo changed the title Unable to receive tableHead type node Node tableHead not parsed correctly Aug 7, 2024
@ChristianMurphy
Copy link
Member

Welcome @diervo! 👋
Sorry you ran into some confusion.

A few things:

  1. Mdast is a markdown syntax tree, not HTML, there is no tableHead node type https://github.com/syntax-tree/mdast-util-gfm-table/tree/abf8c36f37e363194f994cc650a83ce1926c83f9?tab=readme-ov-file#nodes
  2. This package does not make the syntax tree https://github.com/syntax-tree/mdast-util-gfm-table does

I suspect your question is an XY problem, you probably want to do something with a remark plugin?
If so consider opening a discussion with some context on what you are trying to solve https://github.com/orgs/remarkjs/discussions/categories/q-a

@ChristianMurphy ChristianMurphy closed this as not planned Won't fix, can't repro, duplicate, stale Aug 8, 2024
@ChristianMurphy ChristianMurphy added the 🤷 no/invalid This cannot be acted upon label Aug 8, 2024

This comment has been minimized.

@github-actions github-actions bot added 👎 phase/no Post cannot or will not be acted on and removed 🤞 phase/open Post is being triaged manually labels Aug 8, 2024
@diervo
Copy link
Author

diervo commented Aug 8, 2024

Hi @ChristianMurphy,

I understand a little bit about syntax trees, as I have worked quite a lot in the past with @wooorm while he was building micromark and the "formal specification" of markdown :)

The reason I filed the issue was mostly to get the team's take on whether having a new node of type tableHead to disambiguate between a regular tableRow and this new tableHead would be something that you would entertain.

Seems that micromark is already disambiguating and tokenizing it correctly, so seemed like a pretty straight-forward change in the AST generation

There are a lot of precedents of this disambiguation on similar "ast-like abstractions" (ex. ProseMirror), and it will allow for the user to configure whether or not they want to assume that the first row is a header or not.

@ChristianMurphy
Copy link
Member

ChristianMurphy commented Aug 8, 2024

Thanks for the clarification @diervo!
I'm open to the idea of a new node type.
That's a pretty significant change, which would touch a lot of repos.
I'd lean a bit towards it being an RFC https://github.com/unifiedjs/rfcs
To queue it up as a possibility for the next major release.

Would you be interested in drafting an RFC PR with your thoughts?

@diervo
Copy link
Author

diervo commented Aug 8, 2024

It will be hard for me to find the time, not necessarily for the RFC but to help push it through the whole process (I been in enough standardization comittes to know that it takes a village... xD)

But knowing that you are open to it, I will do my best to eventually find some time.

FWIW I think this change could be non-breaking if we default to the current behavior and add some options on top, but certainly it will require some non-trivial debate.

@wooorm
Copy link
Member

wooorm commented Aug 8, 2024

This change would go into mdast. And then be pulled in everywhere else.

There is indeed no tableHeader in mdast, which is intentional.
It existed years ago, and was removed: remarkjs/remark@593bb82. The main discussion then happened here: syntax-tree/mdast#6. These links are more for historical purposes, it’s years later and I’m sure I changed my mind on many things since. But some of that I still agree with.

I think ASTs should be simple. It should be hard to make mistakes in ASTs. With different names for these node types, it’s easier to accidentaly inject a tableHeader as the 2nd row. Or tableRow in the 1st. Or remove that 1st header row. What then?
But differentiating currently if needed, seems clear enough: index == 0.

Finally, as the ASTs and projects grow older, I think it’s sensical to be more stable.

What use cases does this really improve?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤷 no/invalid This cannot be acted upon 👎 phase/no Post cannot or will not be acted on
Development

No branches or pull requests

3 participants