Parse, correctly display RST #5

Closed
abesto opened this issue Aug 9, 2022 · 5 comments

Comments

@abesto
Collaborator

abesto commented Aug 9, 2022

image

Big chunks of documentation text are currently not very good-looking :D Two big things here:

  • Format RST
  • Detect references to symbols and turn them into links, either on py.wtf (needs "(Anchor) links to symbols", #3) or to the standard Python docs where appropriate

Corollary: do we ever expect non-RST strings in documentation? If yes, those will need to get the same treatment.
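
To make the second bullet concrete, here is a minimal sketch of what "detect references to symbols" could mean for Sphinx-style roles in a docstring. Everything in it is an assumption for illustration: the function names, the short list of roles, and especially the `/symbol/` URL prefix, which does not reflect the real py.wtf URL structure. The fallback to the standard Python docs is left out.

```python
from __future__ import annotations

import re

# Matches Sphinx-style cross-reference roles such as :class:`pkg.Foo` or
# :py:func:`~pkg.mod.bar`. Only a handful of role names are covered here.
ROLE_RE = re.compile(
    r":(?:py:)?(?P<role>class|func|meth|mod|attr|exc|data|obj):`(?P<target>[^`]+)`"
)
# "Title <pkg.Name>" form: the real target sits inside the angle brackets.
EXPLICIT_TARGET_RE = re.compile(r".*<(?P<name>[^>]+)>")


def find_symbol_refs(docstring: str) -> list[tuple[str, str]]:
    """Return (role, dotted_name) pairs for every role reference found."""
    refs = []
    for match in ROLE_RE.finditer(docstring):
        target = match.group("target").strip()
        explicit = EXPLICIT_TARGET_RE.fullmatch(target)
        if explicit:
            target = explicit.group("name")
        # A leading "~" only shortens the displayed text, so drop it.
        refs.append((match.group("role"), target.lstrip("~")))
    return refs


def resolve_link(name: str, indexed: set[str], base: str = "/symbol/") -> str | None:
    """Return a link for an indexed symbol, or None if it is unknown.

    `base` is a hypothetical URL prefix, not the actual frontend route.
    """
    return f"{base}{name}" if name in indexed else None


refs = find_symbol_refs("See :class:`pkg.Foo` and :py:func:`~pkg.mod.bar`.")
links = [resolve_link(name, {"pkg.Foo"}) for _, name in refs]
```

A regex like this only finds explicit roles; bare names in running text would need a different approach.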

@zsol
Owner

zsol commented Aug 9, 2022

Yes, markdown happens sometimes, I'd like to support that.

@abesto
Collaborator Author

abesto commented Aug 11, 2022

OK so... to do this well, we probably need some www-side processing, because the indexer doesn't (and shouldn't) know about the URL structure of the frontend (not to mention "should it be <h3> or <Text variant="h3">"). This means we have a few different options; they all end with "... into an AST on the client, and render it so that links to symbols point to the right location". Before that, we might:

  • Parse raw RST (and MD) on the client. A quick search doesn't turn up any promising JS RST parsers, so I'm inclined to not do this.
  • Translate doc strings into some intermediate representation in the indexer; parse that on the client. This seems sane, as I expect adding more "frontends" (in the compiler sense of the word) to this process is easier in Python than in JS.
    • That intermediate representation could be something abstract, like JsonML
    • Or, more realistically, a well-supported markup language. It seems that MyST (a Markdown dialect for technical writing) may be a good candidate. It provides
      • A (Python) RST2MyST implementation, so we can pre-process in the indexer, leaving us with "doc strings are always either CommonMark or MyST (a superset of CommonMark)"
      • A JS MyST parser, which should eat both CommonMark and MyST transparently.

Am I maybe missing an obvious way of making this much less complex?
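
For illustration only, the indexer-side pre-processing in the MyST option boils down to rewrites like the ones below. This is a toy, standard-library sketch that handles just two inline constructs (roles and external hyperlinks); it is not how the RST-to-MyST converter works, and the function name is made up. It relies on MyST writing roles as {name}`content`, and on double-backtick literals already being valid CommonMark code spans.

```python
import re

# RST external hyperlink: `text <url>`_ (or `__` for the anonymous form).
# The lookbehind skips roles such as :class:`title <target>`.
RST_LINK_RE = re.compile(r"(?<!:)`(?P<text>[^`<]+?)\s*<(?P<url>[^`>]+)>`__?")
# RST inline role: :name:`content`, optionally with a domain (:py:class:`X`).
RST_ROLE_RE = re.compile(r":(?P<name>[A-Za-z][\w:-]*):`(?P<content>[^`]+)`")


def rst_inline_to_myst(text: str) -> str:
    """Rewrite a few RST inline constructs into their MyST equivalents."""
    # `text <url>`_  ->  [text](url)
    text = RST_LINK_RE.sub(r"[\g<text>](\g<url>)", text)
    # :role:`content`  ->  {role}`content`
    text = RST_ROLE_RE.sub(r"{\g<name>}`\g<content>`", text)
    # ``literal`` needs no change: it is already a CommonMark code span.
    return text


converted = rst_inline_to_myst(
    "See :py:class:`pkg.Foo` and the `docs <https://example.org/>`_."
)
```

Directives, field lists, and substitutions are where a regex approach stops working, which is the argument for using a real converter in the indexer.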

@zsol
Owner

zsol commented Aug 11, 2022

Agreed, there's definitely going to be www-side processing (this might be buildtime or runtime, no strong opinions there)

There are so many features of RST+Sphinx that might be relevant to us that I'd rather not invest in a custom IR, where we'd have to implement RST -> IR (and maybe MD -> IR) ourselves.

MyST seems promising! Let's go with that. I also don't see any lighter-weight approach of doing this well.

@abesto
Collaborator Author

abesto commented Aug 13, 2022

First stab using the rst_to_myst Python library and the mystjs JS library (both provided by MyST itself) on top of 957ff37: https://gist.github.com/abesto/6ba417a03639f7b5da350a8a37294980

Not very promising.

  • rst_to_myst depends on Click 7, which conflicts with our Click 8. We can downgrade with no effort, or contribute a bump.
  • The conversion makes indexing S L O W
  • The output is... not great. It's like... the JS version doesn't eat the references?
    image

@zsol
Owner

zsol commented Sep 2, 2022

#95 and #97 have addressed the bulk of this issue; #101 and #102 were opened to track more specific problems.

zsol closed this as completed Sep 2, 2022