Optimization when posting list are saturated. #2745
#[derive(Default, Copy, Clone, Debug)]
struct AllAndEmptyScorerCounts {
    all_count: usize,
    empty_count: usize,
-     empty_count: usize,
+     num_empty_scorer: usize,

#[derive(Default, Copy, Clone, Debug)]
struct AllAndEmptyScorerCounts {
    all_count: usize,
-     all_count: usize,
+     num_all_scorer: usize,
src/query/boolean_query/mod.rs
Outdated
index_writer.add_document(doc!(text_field=>"hello happy"))?;
index_writer.commit()?;
let searcher = index.reader()?.searcher();
let term_a: Box<dyn Query> = Box::new(TermQuery::new(
- let term_a: Box<dyn Query> = Box::new(TermQuery::new(
+ let hit_all_term: Box<dyn Query> = Box::new(TermQuery::new(
src/query/boolean_query/mod.rs
Outdated
    Term::from_field_text(text_field, "hello"),
    IndexRecordOption::Basic,
));
let term_b: Box<dyn Query> = Box::new(TermQuery::new(
- let term_b: Box<dyn Query> = Box::new(TermQuery::new(
+ let hit_some_term: Box<dyn Query> = Box::new(TermQuery::new(
src/query/boolean_query/mod.rs
Outdated
    Term::from_field_text(text_field, "happy"),
    IndexRecordOption::Basic,
));
let term_c: Box<dyn Query> = Box::new(TermQuery::new(
- let term_c: Box<dyn Query> = Box::new(TermQuery::new(
+ let hit_none_term: Box<dyn Query> = Box::new(TermQuery::new(
src/query/term_query/term_weight.rs
Outdated
    mut boost: Score,
) -> crate::Result<SpecializedScorer> {
    if !self.scoring_enabled {
        boost = 1.0f32;
Why do we need this?
If a posting list's doc freq equals the segment reader's max_doc, and scoring does not matter, we can replace its scorer with an AllScorer. In turn, a boolean query can discard AllScorers and empty scorers to accelerate the request.
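The saturation check described above can be sketched as a small decision function. This is a minimal illustration, not the code from this PR: `TermScorerChoice` and `choose_term_scorer` are hypothetical names, and treating a doc freq of zero as the empty case is an assumption made here for completeness.

```rust
/// Hypothetical stand-in for the kind of scorer a term query could hand back.
/// These names are illustrative and are not tantivy types.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum TermScorerChoice {
    /// Matches every document of the segment; no posting list has to be read.
    All,
    /// Matches nothing (assumption: used when the term has no postings).
    Empty,
    /// Regular path: iterate over the term's posting list.
    Postings,
}

/// Picks a scorer kind from the term's doc freq and the segment's max_doc.
///
/// The shortcut to `All` is only taken when scoring is disabled, because a
/// scorer that matches everything carries no per-document term statistics.
fn choose_term_scorer(doc_freq: u32, max_doc: u32, scoring_enabled: bool) -> TermScorerChoice {
    if doc_freq == 0 {
        TermScorerChoice::Empty
    } else if !scoring_enabled && doc_freq == max_doc {
        // The posting list is saturated: every document contains the term.
        TermScorerChoice::All
    } else {
        TermScorerChoice::Postings
    }
}
```

With three documents in a segment, a term present in all three maps to `All` only when scoring is off; with scoring on, or with a doc freq of two, the sketch falls back to `Postings`.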
a7212d5 to
dc4c218
Compare
src/query/term_query/term_weight.rs
Outdated
    scoring_enabled: bool,
}

pub(crate) enum SpecializedScorer {
Rename to SpecializedTermScorer?
We already have a SpecializedScorer in boolean_weight.rs, so the current name is confusing.
I renamed it to something explicit, removed it from the pub(crate) API, and made it private.
let positive_scorer = match (should_scorers, must_scorers) {
    (ShouldScorersCombinationMethod::Ignored, must_scorers) => {
        let boxed_scorer: Box<dyn Scorer> = if must_scorers.is_empty() {
            if must_special_scorer_counts.all_count + should_special_scorer_counts.all_count
A comment would be good here
    (CombinationMethod::Ignored, Some(must_scorers)) => {
        SpecializedScorer::Other(intersect_scorers(must_scorers, num_docs))

let positive_scorer = match (should_scorers, must_scorers) {
- let positive_scorer = match (should_scorers, must_scorers) {
+ let include_scorer = match (should_scorers, must_scorers) {
195c5f1 to
814ec2f
Compare
814ec2f to
e864338
Compare
* Optimization when posting lists are saturated.

  If a posting list's doc freq equals the segment reader's max_doc, and scoring does not matter, we can replace its scorer with an AllScorer. In turn, a boolean query can discard AllScorers and empty scorers to accelerate the request.

* Added range query optimization
* CR comment
* CR comments
* CR comment

---------

Co-authored-by: Paul Masurel <[email protected]>
If a posting list's doc freq equals the segment reader's max_doc, and scoring does not matter, we can replace its scorer with an AllScorer.
In turn, a boolean query can discard AllScorers and empty scorers to accelerate the request.
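To make the boolean-query half of that description concrete, here is a toy sketch of how match-everything and match-nothing scorers could be stripped from clause lists. It is an illustration under stated assumptions, not the implementation in this PR: `ToyScorer`, `Simplified`, `strip_trivial_scorers`, `simplify_must`, and `simplify_should` are hypothetical, scoring is assumed to be disabled, and the counts struct only borrows its name and the field names suggested in the review above.

```rust
/// Counts of the trivial scorers removed from a clause list.
#[derive(Default, Copy, Clone, Debug)]
struct AllAndEmptyScorerCounts {
    num_all_scorer: usize,
    num_empty_scorer: usize,
}

/// Toy scorer model (hypothetical): a posting list is a plain vector of doc ids.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ToyScorer {
    All,
    Empty,
    Postings(Vec<u32>),
}

/// Outcome of simplifying one clause list.
#[derive(Debug, PartialEq, Eq)]
enum Simplified {
    /// The clause list as a whole matches every document.
    All,
    /// The clause list as a whole matches nothing.
    Empty,
    /// Only these non-trivial scorers still have to be combined.
    Remaining(Vec<ToyScorer>),
}

/// Removes `All` and `Empty` scorers in place and reports how many of each were seen.
fn strip_trivial_scorers(scorers: &mut Vec<ToyScorer>) -> AllAndEmptyScorerCounts {
    let mut counts = AllAndEmptyScorerCounts::default();
    scorers.retain(|scorer| match scorer {
        ToyScorer::All => {
            counts.num_all_scorer += 1;
            false
        }
        ToyScorer::Empty => {
            counts.num_empty_scorer += 1;
            false
        }
        ToyScorer::Postings(_) => true,
    });
    counts
}

/// MUST clauses are intersected: one `Empty` empties the result,
/// while `All` is neutral and can simply be dropped.
fn simplify_must(mut scorers: Vec<ToyScorer>) -> Simplified {
    let counts = strip_trivial_scorers(&mut scorers);
    if counts.num_empty_scorer > 0 {
        Simplified::Empty
    } else if scorers.is_empty() && counts.num_all_scorer > 0 {
        Simplified::All
    } else {
        Simplified::Remaining(scorers)
    }
}

/// SHOULD clauses are unioned (assumption: scoring disabled, at least one match required):
/// one `All` saturates the result, while `Empty` is neutral and can be dropped.
fn simplify_should(mut scorers: Vec<ToyScorer>) -> Simplified {
    let counts = strip_trivial_scorers(&mut scorers);
    if counts.num_all_scorer > 0 {
        Simplified::All
    } else if scorers.is_empty() && counts.num_empty_scorer > 0 {
        Simplified::Empty
    } else {
        Simplified::Remaining(scorers)
    }
}
```

In the terms of the test discussed in the review, a MUST list holding a hit-all term and a hit-some term reduces to just the hit-some posting list, and adding a hit-none term reduces the whole list to `Empty` without reading any postings.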