Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(build): Separate markdown and html handling into stages #1675

Open
wants to merge 1 commit into
base: v4
Choose a base branch
from

Conversation

necauqua
Copy link
Contributor

@necauqua necauqua commented Dec 25, 2024

So aliases were completely broken imo

There's #904, where obsidian-like wikilink resolution ("shortest") didn't work for aliases - which was an easy, if hacky, fix - if there is a better way to augment allSlugs I'm all ears.

And then the HTML redirect was breaking SPA and popovers for which I managed to figure out a simple fix actually.

Now aliases work great, I'm very pleased 🙃

First run all markdown transformers on all files, and then run all
html transformers on all files.

This allows us to augment allSlugs from a markdown processor, so that
we can add alias slugs from frontmatter, to be later used in the linker
html processor when transforming links.

For this to work with workers I had to write a very cringe hack.
To be able to build the allSlugs list correctly not only from filenames,
but also from frontmatter without such measures would require a complete
build process overhaul I feel like.

@necauqua
Copy link
Contributor Author

Ok one thing missing is aliases don't contribute to the graph/backlinks, will try to figure that one out bit later

@necauqua necauqua force-pushed the completely-fix-aliases branch from 198b494 to ecd78b1 Compare December 25, 2024 15:52
@necauqua
Copy link
Contributor Author

In this forcepush I actually found a slightly better place to augment the allSlugs, now it just looks great.

Looking for the graph/backlinks fix, which is the last thing remaining for the aliases to not have any issues whatsoever

@necauqua
Copy link
Contributor Author

necauqua commented Dec 25, 2024

Found an issue that I'm not sure I can fix - for allSlugs to be correctly fully populated, markdown plugins have to run over all pages and only then the html plugins have to start.

Currently there seems to be a data race where some html links are generated before the frontmatter transformer runs on their target and puts its aliases into allSlugs.

I guess I can make a hacky fix for this - at least for myself - by running a custom pre-pass on all content that parses the frontmatter only to get the aliases from it and populates allSlugs, but I'm not sure there's no better way.

@necauqua
Copy link
Contributor Author

I did the split I mentioned - and it completely fixed the issue, now I'm happy, aliases work fantastically now.

But I had to write a very crude hack to make the allSlugs thing work with concurrency (because the variable is obviously not shared between workers), and I feel like making it work properly is would take a large build process revamp.

Feel free to ignore the last commit - although, well, the first commit kinda depends on it to not be broken.
I realised the first and last commit are only needed for the "shortest" link resolution, and maybe people that don't use Obsidian don't care - well at least the second commit is an independent fix for SPA and popovers.

Sadly the third commit (fixing the graph/backlinks) also depends on the first one 🤷

Anyway, I think I'm done here, you may merge this and live with the hack, or people may use this MR if they need working aliases I guess

@aarnphm
Copy link
Collaborator

aarnphm commented Dec 26, 2024

I'm so confused about this PR, what is broken?

can you write a simple case where it doesn't work?

Copy link
Collaborator

@aarnphm aarnphm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, but this PR seems all over the places, it felt like you tried to include 3 different things in here.

How does alias fixes related to the change in parse?

and the worker seems to be a completely different PR.

quartz/processors/parse.ts Show resolved Hide resolved
@necauqua
Copy link
Contributor Author

necauqua commented Dec 26, 2024

include 3 different things in here.

That's why it's split into 4 commits?..

Ah yeah github has not the best review UI, it'd make way more sense if you look at commits one by one if you didn't do that.

what is broken?

Uhm, oh yeah you also probably didn't read full commit messages, there's a bit more details there.
But basically if you use Obsidian and use any aliases at all a few glaring issues show up immediately.

a simple case

Sure, so make a page in a folder, for example Folder/Page.md, have it have alias: Alias in the frontmatter.
In the homepage, make a wikilink [[Alias]].

  • Firstly, markdownLinkResolution: "shortest" does not work, so because the page is in a different folder, [[Alias]] would just 404, and you have to do [[Folder/Alias]].
    This is the issue described in AliasRedirects not working when files are moved into folders #904.
    It is fixed by the first and the last commits, the idea is that we need to add alias slugs to the allSlugs list.
    Sadly requires the parse change to do markdown and html parsing of the entire set of files separately - because the html link transformation of [[Alias]] in the homepage must happen after the frontmatter markdown parsing of Folder/Page.md - without the parse chage resolution for some aliases works and for some doesn't, it's a data race.

  • Now, suppose we fix or ignore the link resolution issue, so we have an [[Alias]] link that kind of works.
    When you click on it, it reloads the whole page - kind of twice actually because of the HTML redirect - with a pretty jarring flicker, esp. compared to the rest of links working fluidly in the SPA.
    And when you hover it there's no popover.
    This turned out to be a simple fix in the second commit - it's actually independent so if you want I can move it to a separate PR or you can just cherry-pick it.

  • And lastly if you have [[Alias]] then the homepage graph does not show an edge to Page, and Page backlinks do not contain the homepage.
    Also a relatively simple fix, done in the third commit - however that one does depend on alias parsing from the first commit to avoid doing extra work - again, I can move it to a separate PR and recompute the alias slugs from frontmatter in contentIndex.ts

@necauqua
Copy link
Contributor Author

and the worker seems to be a completely different PR

I do agree actually (well at least it's a separate commit), but the first commit is kind of broken without it.

OTOH the first commit doesn't make things worse - although less consistent.

Also I made like seven small fix PRs, I felt it makes sense to at least group all the alias fixes in one 😂

@aarnphm
Copy link
Collaborator

aarnphm commented Dec 26, 2024

I do agree actually (well at least it's a separate commit), but the first commit is kind of broken without it.

OTOH the first commit doesn't make things worse - although less consistent.

Also I made like seven small fix PRs, I felt it makes sense to at least group all the alias fixes in one 😂

You can still separate into different PR, then mention it in the description to tell us what the order for merging.

@aarnphm
Copy link
Collaborator

aarnphm commented Dec 26, 2024

fwiw I don't think we don't have to separate the parser. If you want to update the allSlugs then we can just do it in the frontmatter transformers.

at this point, frontmatter is considered to be the core regardless and always get parsed first so you can assume all of the de/alias and frontmatter item will get propagated there.

Not sure what you meant by "transformers for markdown and html will be parsed separately" here.

we should run all markdownplugins => htmlplugins

@necauqua
Copy link
Contributor Author

Not sure what you meant by "transformers for markdown and html will be parsed separately" here.

Right now, "markdownplugins => htmlplugins" runs for every file, but not for all files separately.

So for two pages A and B this is the order of what currently happens:
markdown A, html A, markdown B, html B

What has to happen for allSlugs thing to work correctly is:
markdown A, markdown B, html A, html B

This is because html A can depend on markdown B if it links to AliasB, and if you look at the first order above you can see how this breaks - alias slugs from markdown B are not yet added to allSlugs when html A runs.

I hope that answers the questions of why I had to separate the parser in two stages, and how it's unavoidable

@necauqua
Copy link
Contributor Author

necauqua commented Dec 26, 2024

You can still separate into different PR, then mention it in the description to tell us what the order for merging

GitHub is actually awful for stacked (dependent on one another) PRs.

So right now we have 4 commits here:

  1. alias link resolution fix with allSlugs
  2. SPA+popover fix
  3. graph/backlinks fix
  4. parser split thing for 1. to actually work

If you insist on making separate PRs here's my plan:

  • Move 2. out into a separate PR, it's independent, easy.
  • Remove 3. from here.
  • Now we're left with 1 and 4 in this PR, and I'll make a PR for 3. after this gets merged, because 3. depends on code changes from 1.
  • Optionally if you want I can move 4. into another PR as well

edit:
just realised we can merge 4. before 1. lol
ok when I get to my PC I'll split this into 3 PRs (without 3.) and mention that 1. has to be merged after 4., and after 4. and 1. are merged I'll make another PR for 3. :)

@necauqua necauqua force-pushed the completely-fix-aliases branch from ae891ef to 1ae1f30 Compare December 26, 2024 22:16
@necauqua necauqua changed the title fix(aliases): Fix issues with aliases fix(build): Separate markdown and html handling into stages Dec 26, 2024
@necauqua
Copy link
Contributor Author

Split this thing into 4 PRs with some of them depending on others, it does indeed look a bit nicer I guess (I mean the diff view are the same as if you'd look at the commits in one PR separately).

I'm assuming you're ok with leaving the filter(..).flatMap(..) => flatMap(..) thing, so I resolved those review comments, feel free to un-okay it :)

@jackyzha0
Copy link
Owner

passby review to say this is nice, aliases didnt get much thought put into it so thanks for trying to fix it haha

@aarnphm aarnphm self-requested a review December 27, 2024 00:07
Copy link
Collaborator

@aarnphm aarnphm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tiny things, otherwise lgtm

quartz/processors/parse.ts Outdated Show resolved Hide resolved
Comment on lines +21 to +23
// this is a hack
// we assume markdown parsers can add to `allSlugs`,
// but don't actually use them
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big assumption imho.

I think there is no preventing users from using allSlugs in markdown in this case?

Copy link
Contributor Author

@necauqua necauqua Dec 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the thing is if some markdown transformer adds to allSlugs and some other uses allSlugs - there will always be ordering issues similar to what I'm trying to prevent with html

So basically it makes little sense to do that.

This could kind of be fixed by assigning priorities to transformers and then doing multiple stages for all files (again literally the same thing as I'm doing with html, but multiple times), but that's a whole lot of work for no gain as we don't use that anyway.

And what I just said also makes no sense with workers as we actually need to have a concurrently shared list that different workers need to add to and read from(!) which I've no idea how to do (imagine having normal threads and synchronization tools in node) - and again, a whole lotta complexity for "correctness", or this little (but very ugly) hack we have here 😅

Actually this ugly hack will still work lol, so yeah, we can have the priorities thing, but we don't need it thankfully, as only splitting md and html processors was enough

@necauqua
Copy link
Contributor Author

This coincidentally is also a requirement for dataview lol (yes I'm looking into it)

The dataview transformer needs to know the frontmatter (and parse the content for inline fields) of all the files 🙃

@necauqua necauqua force-pushed the completely-fix-aliases branch from 1ae1f30 to f28ba5b Compare December 27, 2024 20:03
@necauqua
Copy link
Contributor Author

necauqua commented Dec 27, 2024

Figured why the types didn't align - PluggableList that .use accepts erases the types sadly, so nothing we could really do.

Also improved the transformer types a little (well was trying to fix types before realising you can't), now in md transformers we know that the node is MdRoot and in html transformers is HtmlRoot

Undid that because it was breaking at least the GFM plugin and I've no mental capacity to look into that eh

@necauqua necauqua force-pushed the completely-fix-aliases branch 2 times, most recently from fb91766 to ff19670 Compare December 27, 2024 20:12
@necauqua
Copy link
Contributor Author

necauqua commented Dec 30, 2024

Hey, is there anything left for me to do to get this (and all the other things I did on top lul) merged?

Would love to stop being on top of an unholy megamerge of a dozen of my little fixes/features locally 😂

There are more fixes/features I plan to push that depend on things I already pushed too (not depend depend, but dealing with like conflicts and such could be annoying)

edit: Oh wait I literally did not realise it's christmas/new year holidays, no pressure lol 😅

Copy link
Collaborator

@aarnphm aarnphm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finals few

quartz/plugins/vfile.ts Outdated Show resolved Hide resolved
quartz/processors/parse.ts Outdated Show resolved Hide resolved
quartz/processors/parse.ts Outdated Show resolved Hide resolved
quartz/processors/parse.ts Outdated Show resolved Hide resolved
quartz/processors/parse.ts Show resolved Hide resolved
quartz/worker.ts Show resolved Hide resolved
@necauqua necauqua force-pushed the completely-fix-aliases branch from ff19670 to 0226692 Compare January 2, 2025 20:59
First run all markdown transformers on all files, and then run all
html transformers on all files.

This allows us to augment `allSlugs` from a markdown processor, so that
we can add alias slugs from frontmatter, to be later used in the linker
html processor when transforming links.

For this to work with workers I had to write a very cringe hack.
To be able to build the allSlugs list correctly not only from filenames,
but also from frontmatter without such measures would require a complete
build process overhaul I feel like.
@necauqua necauqua force-pushed the completely-fix-aliases branch from 0226692 to 6a05545 Compare January 2, 2025 21:17
@aarnphm aarnphm requested a review from jackyzha0 January 4, 2025 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants