Parse, correctly display RST #5

abesto · 2022-08-09T08:49:24Z

Big chunks of documentation text are currently not very good-looking :D Two big things here:

- Format RST
- Detect links to symbols and turn them into links on py.wtf (needs "(Anchor) links to symbols" #3) or to the standard Python docs where appropriate

Corollary: do we ever expect non-RST strings in documentation? If yes, those will need to get the same treatment.

zsol · 2022-08-09T12:11:58Z

Yes, markdown happens sometimes, I'd like to support that.

abesto · 2022-08-11T08:33:47Z

OK so... to do this well, we probably need some www-side processing, because the indexer doesn't (and shouldn't) know about the URL structure of the frontend (not to mention "should it be <h3> or <Text variant="h3">"). This means we have a few different options; they all end with "... into an AST on the client, and render it so that links to symbols point to the right location". Before that, we might:

- Parse raw RST (and MD) on the client. A quick search doesn't turn up any promising JS RST parsers, so I'm inclined to not do this.
- Translate doc strings into some intermediate representation in the indexer; parse that on the client. This seems sane, as I expect adding more "frontends" (in the compiler sense of the word) to this process is easier in Python than in JS.
  - That intermediate representation could be something abstract, like JsonML
  - Or, more realistically, a well-supported markup language. It seems that MyST (a Markdown dialect for technical writing) may be a good candidate. It provides
    - A (Python) RST2MyST implementation, so we can pre-process in the indexer, leaving us with "doc strings are always either CommonMark or MyST (a superset of CommonMark)"
    - A JS MyST parser, which should eat both CommonMark and MyST transparently.

Am I maybe missing an obvious way of making this much less complex?
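
To make the indexer/frontend split above concrete, here is a minimal sketch of the "abstract intermediate representation" idea, not what py.wtf actually does. It assumes doc strings use Sphinx-style roles such as :class:`foo.Bar`; the names `Text`, `SymbolRef` and `split_docstring` are hypothetical. The indexer only records *which* symbol is referenced, and the frontend would later decide what URL that maps to. Roles with an explicit title (`` :class:`title <target>` ``) are not handled.

```python
import re
from dataclasses import dataclass
from typing import List, Union

# Sphinx-style cross-reference roles, with or without the "py:" domain prefix.
# Only a handful of common roles are covered in this sketch.
ROLE_RE = re.compile(
    r":(?:py:)?(?P<role>class|func|meth|mod|attr|exc|obj):`(?P<target>[^`]+)`"
)


@dataclass
class Text:
    """A run of doc string text that is not a symbol reference."""
    value: str


@dataclass
class SymbolRef:
    """A reference to a symbol; deliberately carries no URL."""
    role: str
    target: str  # dotted name as written in the doc string
    label: str   # text to display for the link


def split_docstring(doc: str) -> List[Union[Text, SymbolRef]]:
    """Split a doc string into plain text and URL-agnostic symbol references."""
    nodes: List[Union[Text, SymbolRef]] = []
    pos = 0
    for match in ROLE_RE.finditer(doc):
        if match.start() > pos:
            nodes.append(Text(doc[pos:match.start()]))
        target = match.group("target")
        if target.startswith("~"):
            # Sphinx convention: a leading "~" displays only the last component.
            target = target[1:]
            label = target.rsplit(".", 1)[-1]
        else:
            label = target
        nodes.append(SymbolRef(match.group("role"), target, label))
        pos = match.end()
    if pos < len(doc):
        nodes.append(Text(doc[pos:]))
    return nodes
```

Serialized as JSON, a list like this is something the client could walk and render without the indexer ever knowing the frontend's URL structure. It also shows the cost discussed above: every RST feature beyond simple roles would need its own handling.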

zsol · 2022-08-11T09:17:59Z

Agreed, there's definitely going to be www-side processing (this might be buildtime or runtime, no strong opinions there)

There are so many features of RST+Sphinx that might be relevant to us that I'd rather not invest in a custom IR where we'd have to implement an RST -> IR (and maybe MD -> IR) translation ourselves.

MyST seems promising! Let's go with that. I also don't see any lighter-weight approach of doing this well.

abesto · 2022-08-13T08:34:05Z

First stab using the rst_to_myst Python library and the mystjs JS library (both provided by MyST itself) on top of 957ff37: https://gist.github.com/abesto/6ba417a03639f7b5da350a8a37294980

Not very promising.

- rst_to_myst depends on Click 7, which conflicts with our Click 8. We can downgrade with no effort, or contribute a bump.
- The conversion makes indexing S L O W
- The output is... not great. It's like... the JS version doesn't eat the references?

zsol · 2022-09-02T11:33:05Z

#95 and #97 have addressed the bulk of this issue; #101 and #102 have been opened to track more specific problems.

abesto added www indexer labels Aug 11, 2022

zsol closed this as completed Sep 2, 2022
