Hi,
Thank you for this excellent library. So far, I have successfully integrated SymSpell into a categorization algorithm. It works well and is fast. I am looking for suggestions on how to extend my current algorithm with substring search:
I am using a list of keywords as a dictionary. Misspelled or truncated words are corrected to the matching keyword, which determines the category of the string. For example, 'salar for April' and 'Life Insuranse' are changed to 'salary for April' and 'Life Insurance', respectively, since 'salary' and 'insurance' are in the keyword list. However, some strings are not only misspelled but are also missing spaces or contain too many mistakes. So 'salaryfor April', 'LifeInsurance' and 'salaryyyy' are not recognized and therefore cannot be categorized by the current solution. Using the whole vocabulary as a dictionary is not feasible. Instead, I want to implement substring search, which would let me find strings that contain certain substrings such as 'salar', 'insuran', 'accommod' and so on.
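To make the idea concrete, here is a rough sketch of the kind of substring search I have in mind. It does not use SymSpell at all: it is plain Python that computes the smallest edit distance between a keyword stem and any substring of the input (the classic approximate substring matching dynamic program). The `STEMS` table, the function names and the `max_distance` default are made up for illustration.

```python
def fuzzy_substring_distance(pattern: str, text: str) -> int:
    """Smallest Levenshtein distance between pattern and any substring of text."""
    # Row 0 is all zeros: a match may start at any position in text at no cost.
    prev = [0] * (len(text) + 1)
    for i, pc in enumerate(pattern, start=1):
        # Column 0: matching the first i pattern characters against nothing.
        curr = [i] + [0] * len(text)
        for j, tc in enumerate(text, start=1):
            cost = 0 if pc == tc else 1
            curr[j] = min(
                prev[j] + 1,         # pattern character missing from text
                curr[j - 1] + 1,     # extra character in text
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    # The match may also end anywhere in text, so take the best column.
    return min(prev)


# Hypothetical stem -> keyword table, for illustration only.
STEMS = {
    "salar": "salary",
    "insuran": "insurance",
    "accommod": "accommodation",
}


def categorize(text: str, max_distance: int = 1) -> list[str]:
    """Return keywords whose stem occurs in text within max_distance edits."""
    normalized = text.lower()
    hits = []
    for stem, keyword in STEMS.items():
        distance = fuzzy_substring_distance(stem, normalized)
        if distance <= max_distance:
            hits.append((distance, keyword))
    # Closest matches first.
    return [keyword for _, keyword in sorted(hits)]


print(categorize("salaryfor April"))
print(categorize("LifeInsurance"))
print(categorize("salaryyyy"))
```

This handles the missing-space and repeated-letter cases above, but it scans every stem against every string, so the cost grows with the number of stems times the string length, and short stems with a nonzero `max_distance` are likely to produce false positives.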
Can SymSpell be used for substring search? Or do you have other suggestions on how to implement this idea effectively and combine it with SymSpell?
Thank you in advance.