Weird search result order #337

Open
HonkingGoose opened this issue Oct 29, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@HonkingGoose
Collaborator

HonkingGoose commented Oct 29, 2023

What browser are you using?

Firefox

Other browser name

No response

Describe the bug

When searching on the docs site, the precise match for the dependencyDashboard config option sorts behind things like dependencyDashboardTitle. Usually a precise match sorts higher than partial matches. 🙃

Steps to reproduce

  1. Go to Renovate's docs site.
  2. Search for dependencyDashboard.
  3. The precise match for dependencyDashboard is not the first result.
  4. I would expect the precise match dependencyDashboard to sort higher.

(Screenshot: dependencyDashboard-search-query)

Additional context

Is our separator tokenization causing problems?

We changed Material for MkDocs's default search behavior (tokenization). Maybe that's related? Here's the relevant snippet from our mkdocs.yml config file:

plugins:
  - search:
      separator: '[\s\-,:!?=\[\]()<>{}"/\\]+|\.(?!\d)|&[lg]t;'
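One way to check the tokenization question is to run the separator pattern against sample text and see which tokens come out. The sketch below is only an approximation: it uses Python's `re` module instead of the JavaScript regex engine that the search actually runs in, it skips any lowercasing or stemming the search pipeline may apply, and `tokenize` is a hypothetical helper name, not part of MkDocs or Lunr.js.

```python
import re

# Separator pattern copied from the mkdocs.yml snippet above.
SEPARATOR = re.compile(r'[\s\-,:!?=\[\]()<>{}"/\\]+|\.(?!\d)|&[lg]t;')


def tokenize(text):
    """Split text on the separator pattern and drop empty tokens."""
    return [token for token in SEPARATOR.split(text) if token]


print(tokenize("Set dependencyDashboardTitle, not dependencyDashboard."))
# ['Set', 'dependencyDashboardTitle', 'not', 'dependencyDashboard']

print(tokenize("Pin to v1.2.3"))
# ['Pin', 'to', 'v1.2.3']
```

Under these assumptions, `dependencyDashboard` and `dependencyDashboardTitle` each stay a single token, so the separator on its own does not split the config option names apart.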

Related PRs for the separator thing:

@TWiStErRob you helped a lot before, do you want to brainstorm again? 😄

Material for MkDocs search boost feature?

Material for MkDocs has a "search boost" feature [1], but that applies to the whole page, not just a config option. They recommend starting with a low positive value first. For example:

---
search:
  boost: 2
---

# Page title
...

Boosting the "config options docs page" probably causes other sorting issues... But I wanted to mention boosting, in case it inspires any ideas. 😄
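To make the concern concrete, here is a toy sketch of a page-level boost. It assumes the boost acts as a plain multiplier on every result from the boosted page; that is an assumption for illustration, since the exact mechanics are not described in this thread. The page names, section names, and scores are made up, and `apply_page_boost` is a hypothetical helper.

```python
def apply_page_boost(section_scores, page_boosts):
    """Multiply each section's score by the boost of the page it lives on.

    Pages without an entry in page_boosts keep their score (boost of 1.0).
    """
    return {
        (page, section): score * page_boosts.get(page, 1.0)
        for (page, section), score in section_scores.items()
    }


# Made-up scores: the partial match outranks the exact match on the same page.
scores = {
    ("configuration-options", "dependencyDashboard"): 1.0,
    ("configuration-options", "dependencyDashboardTitle"): 1.4,
    ("faq", "dashboard"): 1.2,
}

boosted = apply_page_boost(scores, {"configuration-options": 2.0})
for key in sorted(boosted, key=boosted.get, reverse=True):
    print(key, boosted[key])
```

In this sketch both config options move above the FAQ entry, but their relative order stays the same, which is why a whole-page boost would not fix the exact-match problem by itself.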

Future search improvements in Material for MkDocs

The Material for MkDocs maintainer is working on better search. Right now Material uses the Lunr.js search engine, and the maintainer plans to replace it with something better suited to searching a docs site. [2]

Footnotes

  1. Material for MkDocs, search boost

  2. Material for MkDocs repo, maintainer is going to improve search

@HonkingGoose added the bug label Oct 29, 2023
@TWiStErRob

I noticed this as well a few weeks ago, just didn't fully realise it. Changing the separator splitting is unlikely to help, because the tokens are full words as far as I know (we removed the "case change" separator).

Boost is on the right track, but this might be straight up a Lunr ranking issue. It would be interesting to reproduce this on a smaller example with MkDocs first, and then strip out MkDocs around it to see if Lunr ranks things correctly when used directly.

@TWiStErRob

This comment was marked as off-topic.

@squidfunk

squidfunk commented Oct 31, 2023

We're in the midst of reworking the search. Yes, Lunr.js search result order sometimes feels non-deterministic, because BM25 scoring is far from ideal for typeahead. We're going to throw out Lunr.js soon. Search is currently the big topic I'm working on. That being said, you can try to tweak it with the separator and boost settings in the meantime.

Additionally, I would kindly ask you to not mention me for such things. Next time, please create a discussion or an issue in squidfunk/mkdocs-material. You can probably imagine that I get mentioned a lot and I have to budget time for communication. If you use our discussion and issue boards, other users might help you as well, which gives me more time to work on new things, including the issue in the OP. Thank you!
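The BM25 remark above can be illustrated with a small, self-contained sketch. It uses a common textbook BM25 variant, not Lunr.js's actual implementation, and it assumes typeahead means "every indexed term that starts with the query counts as a match". The sample documents and the function name `bm25_prefix_scores` are made up for illustration.

```python
import math
from collections import Counter


def bm25_prefix_scores(query, docs, k1=1.2, b=0.75):
    """Score documents for a prefix query with a textbook BM25 formula.

    Any term that starts with `query` is treated as a match (assumed
    typeahead behaviour). Exact and partial matches are scored the same way.
    """
    tokenized = {name: text.split() for name, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    vocabulary = {tok for tokens in tokenized.values() for tok in tokens}
    matching_terms = sorted(t for t in vocabulary if t.startswith(query))

    scores = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in matching_terms:
            tf = counts[term]
            if tf == 0:
                continue
            # Document frequency: how many documents contain this term.
            df = sum(1 for other in tokenized.values() if term in other)
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores[name] = score
    return scores


docs = {
    "dependencyDashboard": "dependencyDashboard enables the dashboard issue",
    "dependencyDashboardTitle": "dependencyDashboardTitle sets the title dependencyDashboardTitle",
    "other": "an unrelated option described here",
}

scores = bm25_prefix_scores("dependencyDashboard", docs)
print(sorted(scores, key=scores.get, reverse=True))
# ['dependencyDashboardTitle', 'dependencyDashboard', 'other']
```

In this toy corpus the section that repeats `dependencyDashboardTitle` outscores the exact match, because the formula rewards term frequency and rarity but has no notion of "the query equals the whole term".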

@TWiStErRob

Sure, thanks for the heads up! And the detailed and fast response!

@HonkingGoose
Collaborator Author

HonkingGoose commented Nov 8, 2023

Thank you @squidfunk for the response and extra information! ❤️

I don't want to "mess around" with tokenization of the search again. Search seems to work OK in general, except that you can't narrow the search results by giving more keywords. Touching the tokenization also runs the risk of breaking other parts of the search. 🙈

I'll keep this issue open, so any potential bug reporters see it. For now, the easiest thing for us is to wait for Material for MkDocs's better search.

Edit: we'll probably want to follow this upstream issue:

@squidfunk

🙋‍♀️ Please see squidfunk/mkdocs-material#6321 – feedback wanted!


3 participants