Searching for "any" can't find "mopa" crate. #1407

Open
vi opened this issue May 31, 2018 · 8 comments
Labels
A-backend ⚙️ A-keywords A-search C-enhancement ✨ Category: Adding new behavior or a change to the way an existing feature works

Comments

@vi

vi commented May 31, 2018

mopa's Cargo.toml:

keywords = ["any", "macro"]

Yet when I search for "any" in crates.io, I get "0 crates found. Get started and create your own.".

@sgrif
Contributor

sgrif commented Jun 1, 2018

This is the behavior of PostgreSQL full text search, which we have very little control over. I believe in this case, "any" is a stop word.

Stop words are words that are very common, appear in almost every document, and have no discrimination value. Therefore, they can be ignored in the context of full text searching. For example, every English text contains words like a and the, so it is useless to store them in an index.
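To make the effect concrete, here is a minimal sketch of stop word removal, assuming a small illustrative word list (the `STOP_WORDS` array and `normalize_query` are hypothetical names, not PostgreSQL's actual dictionary or API). It shows how a query consisting only of "any" could end up empty before any matching happens:

```rust
use std::collections::HashSet;

/// Illustrative subset of an English stop word list (an assumption,
/// not PostgreSQL's actual dictionary file).
const STOP_WORDS: [&str; 5] = ["a", "the", "any", "and", "of"];

/// Lowercases the query, splits it on whitespace, and drops stop words,
/// roughly mimicking what a full text search does before matching.
fn normalize_query(query: &str) -> Vec<String> {
    let stop: HashSet<&str> = STOP_WORDS.iter().copied().collect();
    query
        .split_whitespace()
        .map(|w| w.to_lowercase())
        .filter(|w| !stop.contains(w.as_str()))
        .collect()
}
```

Under these assumptions, `normalize_query("any")` returns an empty list, so nothing is left to search for, while `normalize_query("any macro")` keeps only "macro".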

sgrif closed this as completed on Jun 1, 2018
@vi
Author

vi commented Jun 1, 2018

Should there be a policy about valid keywords in Cargo.toml? Maybe cargo publish should reject invalid (unsearchable) keywords?

What keyword should be used in relation to std::any::Any? Literally std::any::Any?

Also, crates.io should show a dedicated message for invalid search requests, not just "nothing found".

@sgrif
Contributor

sgrif commented Jun 1, 2018

Perhaps we should modify the query to check whether any of the keywords exactly match the search text.
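A rough in-memory sketch of that idea, under stated assumptions: `CrateEntry`, `STOP_WORDS`, `full_text_match`, and `search` are all hypothetical stand-ins, not the crates.io schema or its actual query. The full text match is simulated by dropping stop words first, and the fallback compares the raw search text against each keyword.

```rust
/// Hypothetical in-memory stand-in for a crate row; not the crates.io schema.
struct CrateEntry {
    name: &'static str,
    keywords: Vec<&'static str>,
}

/// Illustrative stop words (an assumed subset).
const STOP_WORDS: [&str; 4] = ["a", "the", "any", "and"];

/// Simulated full text match: stop words are dropped from the query first,
/// so a query made only of stop words matches nothing.
fn full_text_match(entry: &CrateEntry, query: &str) -> bool {
    query
        .split_whitespace()
        .map(|w| w.to_lowercase())
        .filter(|w| !STOP_WORDS.contains(&w.as_str()))
        .any(|w| entry.name == w.as_str() || entry.keywords.iter().any(|k| *k == w.as_str()))
}

/// Search with the fallback: also accept an exact keyword match on the
/// raw search text, which bypasses stop word removal.
fn search(crates: &[CrateEntry], query: &str) -> Vec<&'static str> {
    let raw = query.trim().to_lowercase();
    crates
        .iter()
        .filter(|c| full_text_match(c, query) || c.keywords.iter().any(|k| *k == raw.as_str()))
        .map(|c| c.name)
        .collect()
}
```

With a `mopa` entry whose keywords are "any" and "macro", the simulated full text match alone rejects the query "any", but `search` still returns `mopa` through the exact keyword comparison.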

sgrif reopened this on Jun 1, 2018
@vi
Author

vi commented Jun 1, 2018

Anyway, if there are no search results because the search query is too short (or something similar), a special message should be shown, like "Too many results to display. Please narrow your search query".

@sgrif
Contributor

sgrif commented Jun 1, 2018

This isn't about it being too short, it's about it only containing stop words. Unfortunately, PG gives us no way to actually detect whether a full text search query is ignored or not.

@vi
Author

vi commented Jun 2, 2018

As a workaround, the list of stop words could simply be copied from PG.

Sketch in Rust:

use std::collections::BTreeSet;

// Choose the message shown above the search results.
// `stopwords` is assumed to be a copy of PG's stop word list.
fn search_message(
    query: &BTreeSet<String>,
    stopwords: &BTreeSet<String>,
    number_of_search_results: usize,
) -> String {
    // Query words that the full text search will ignore.
    let ignored: Vec<&str> = query.intersection(stopwords).map(String::as_str).collect();
    if query.is_empty() {
        "Search query is empty".to_string()
    } else if ignored.len() == query.len() {
        "Search query contains only stopwords".to_string()
    } else if number_of_search_results == 0 {
        if ignored.is_empty() {
            "Nothing found".to_string()
        } else {
            format!("Nothing found. Note that the following stopwords are ignored: {}", ignored.join(", "))
        }
    } else {
        "Results:".to_string()
    }
}

carols10cents added the C-bug 🐞, A-search, A-postgres, and A-keywords labels on Jun 27, 2018
@kzys
Contributor

kzys commented Aug 27, 2019

Can't we change PostgreSQL's dictionary like https://stackoverflow.com/a/2227235?

Turbo87 added the C-enhancement ✨ label and removed the C-bug 🐞 label on Sep 26, 2021
@eth3lbert
Contributor

I shared some ideas about a possible solution on Zulip.
