Problem Description:
Although the following idea was born from an extreme example, it could help whenever the system is under strain, giving users faster responses for qualifying searches.
#633 documents a search with 217 words. When the words are not quoted, the search takes 25-26 seconds, which exceeds the 20-second timeout. MarkLogic removes 95 stop words, but that still leaves 122 words to which the keyword search pattern is applied. Each of those 122 words is individually checked in the referenceName index (>22 million values) of all documents linked to by the lux('itemAny') predicate (>20 million docs). That's a lot of index lookups and ensuing joins.
#635's optimization idea would not have helped with this particular search because the keyword search pattern includes a values function.
Expected Behavior/Solution:
After removing stop words, de-duplicate the remaining words and phrases. Restrict de-duplication to criteria resolved against the same document set and within the same grouping (AND or OR). In #633's case, that would have reduced the 122 words to 105 or 106, depending on case sensitivity. Speculating, and assuming search time scales roughly with the number of words checked, the query could have completed in about 87% of the time (106/122 ≈ 0.87, or roughly 22 seconds), which is not necessarily enough to come in under the current 20-second search timeout.
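The de-duplication step could be sketched as follows. This is only an illustration, not LUX's implementation: Python is used for brevity, the function name `dedupe_terms` and its `case_sensitive` parameter are hypothetical, and the sketch assumes the terms passed in have already had stop words removed and all belong to the same document set and the same AND/OR grouping.

```python
from typing import Iterable, List


def dedupe_terms(terms: Iterable[str], case_sensitive: bool = False) -> List[str]:
    """Return the terms with duplicates removed, keeping first occurrences in order.

    Hypothetical helper: assumes every term belongs to the same grouping
    (AND or OR) and is resolved against the same document set.
    """
    seen = set()
    unique = []
    for term in terms:
        # When case does not matter, compare on a case-folded key
        # but keep the term as the user originally typed it.
        key = term if case_sensitive else term.casefold()
        if key not in seen:
            seen.add(key)
            unique.append(term)
    return unique


# Example with a tiny, made-up list of search words.
words = ["Art", "art", "museum", "art", "Museum"]
print(dedupe_terms(words))                       # case-insensitive comparison
print(dedupe_terms(words, case_sensitive=True))  # case-sensitive comparison
```

Whether the comparison should be case-sensitive depends on how the underlying index resolves terms, which is why the count for #633 is given as 105 or 106.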
Requirements:
See above.
Needed for promotion:
If an item on the list is not needed, it should be crossed off but not removed.
UAT/LUX Examples:
Dependencies/Blocks:
- Blocked By: Nothing
- Blocking: Better performance for qualifying searches.
Related Github Issues:
None at the time of submission.
Related links:
None at the time of submission.
Wireframe/Mockup:
N/A