[Backport][8.17] Fixes scroll helper - Backports #2557 #2560

Merged 3 commits on Jan 15, 2025
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,11 @@
*See the full release notes on the official documentation website: https://www.elastic.co/guide/en/elasticsearch/client/ruby-api/current/release_notes.html*

+## 8.17.1 Release notes
+
+### Client
+
+* Fixes ScrollHelper issue #2556 - There was a bug where an additional search (with scroll) request was made to Elasticsearch for each resulting hit. It was rewritten so that the docs are retrieved as needed and the Helper instance doesn't store documents internally, with big savings in memory and requests to Elasticsearch.
+
## 8.17.0 Release notes

### Client
@@ -13,6 +19,11 @@
* `rest_total_hits_as_int` (Boolean): Indicates whether hits.total should be rendered as an integer or an object in the rest search response.
* `open_point_in_time` - Adds `allow_partial_search_results` (Boolean) parameter: Specify whether to tolerate shards missing when creating the point-in-time, or otherwise throw an exception (default: false).

+## 8.16.1 Release notes
+
+### Client
+
+* Fixes ScrollHelper issue #2556 - There was a bug where an additional search (with scroll) request was made to Elasticsearch for each resulting hit. It was rewritten so that the docs are retrieved as needed and the Helper instance doesn't store documents internally, with big savings in memory and requests to Elasticsearch.

## 8.16.0 Release notes

10 changes: 10 additions & 0 deletions docs/release_notes/816.asciidoc
@@ -1,6 +1,16 @@
[[release_notes_8_16]]
=== 8.16 Release notes

+[discrete]
+[[release_notes_8_16_1]]
+=== 8.16.1 Release notes
+
+[discrete]
+==== Client
+
+* Fixes ScrollHelper issue https://github.com/elastic/elasticsearch-ruby/issues/2556[#2556] - There was a bug where an additional search (with scroll) request was made to Elasticsearch for each resulting hit. It was rewritten so that the docs are retrieved as needed and the Helper instance doesn't store documents internally, with big savings in memory and requests to Elasticsearch.
+
+
[discrete]
[[release_notes_8_16_0]]
=== 8.16.0 Release notes
10 changes: 10 additions & 0 deletions docs/release_notes/817.asciidoc
@@ -1,6 +1,16 @@
[[release_notes_8_17]]
=== 8.17 Release notes

+[discrete]
+[[release_notes_8_17_1]]
+=== 8.17.1 Release notes
+
+[discrete]
+==== Client
+
+* Fixes ScrollHelper issue https://github.com/elastic/elasticsearch-ruby/issues/2556[#2556] - There was a bug where an additional search (with scroll) request was made to Elasticsearch for each resulting hit. It was rewritten so that the docs are retrieved as needed and the Helper instance doesn't store documents internally, with big savings in memory and requests to Elasticsearch.
+
+
[discrete]
[[release_notes_8_17_0]]
=== 8.17.0 Release notes
2 changes: 1 addition & 1 deletion elasticsearch-api/lib/elasticsearch/api/version.rb
@@ -17,6 +17,6 @@

 module Elasticsearch
   module API
-    VERSION = '8.17.0'.freeze
+    VERSION = '8.17.1'.freeze
   end
 end
2 changes: 1 addition & 1 deletion elasticsearch/elasticsearch.gemspec
@@ -46,7 +46,7 @@ Gem::Specification.new do |s|
   s.required_ruby_version = '>= 2.5'

   s.add_dependency 'elastic-transport', '~> 8.3'
-  s.add_dependency 'elasticsearch-api', '8.17.0'
+  s.add_dependency 'elasticsearch-api', '8.17.1'

   s.add_development_dependency 'base64'
   s.add_development_dependency 'bundler'
18 changes: 4 additions & 14 deletions elasticsearch/lib/elasticsearch/helpers/scroll_helper.rb
@@ -43,12 +43,8 @@ def initialize(client, index, body, scroll = '1m')
       # @yieldparam document [Hash] yields a document found in the search hits.
       #
       def each(&block)
-        @docs = []
-        @scroll_id = nil
-        refresh_docs
-        for doc in @docs do
-          refresh_docs
-          yield doc
+        until (docs = results).empty?
+          docs.each(&block)
         end
         clear
       end
@@ -70,25 +66,19 @@ def results
       #
       def clear
         @client.clear_scroll(body: { scroll_id: @scroll_id }) if @scroll_id
-        @docs = []
+        @scroll_id = nil
       end

       private

-      def refresh_docs
-        @docs ||= []
-        @docs << results
-        @docs.flatten!
-      end
-
       def initial_search
         response = @client.search(index: @index, scroll: @scroll, body: @body)
         @scroll_id = response['_scroll_id']
         response['hits']['hits']
       end

       def scroll_request
-        @client.scroll(body: {scroll: @scroll, scroll_id: @scroll_id})['hits']['hits']
+        @client.scroll(body: { scroll: @scroll, scroll_id: @scroll_id })['hits']['hits']
       end
     end
   end
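The request pattern behind this change can be reproduced without a cluster. What follows is a minimal sketch, not the library's code: `CountingClient`, `BaseHelper`, `OldEach`, `NewEach`, and `requests_for` are hypothetical names, the client is a stub that serves 24 fake hits in pages of 12, and the body of `results` is an assumption because the diff does not show it. Only the two `each` bodies and `refresh_docs` mirror the lines in the hunk.

```ruby
# Hypothetical stub standing in for the Elasticsearch client. It serves
# `total` fake hits in pages of `page_size` and counts search/scroll requests.
class CountingClient
  attr_reader :requests

  def initialize(total:, page_size:)
    @pages = (1..total).map { |i| { '_id' => i.to_s } }.each_slice(page_size).to_a
    @requests = 0
  end

  def search(**_args)
    next_page
  end

  def scroll(**_args)
    next_page
  end

  private

  def next_page
    @requests += 1
    { '_scroll_id' => 'stub', 'hits' => { 'hits' => @pages.shift || [] } }
  end
end

# Shared plumbing. `results` is an assumed implementation: first call searches,
# later calls scroll.
class BaseHelper
  def initialize(client)
    @client = client
    @scroll_id = nil
  end

  def results
    response = if @scroll_id
                 @client.scroll(body: { scroll_id: @scroll_id })
               else
                 @client.search(body: {})
               end
    @scroll_id = response['_scroll_id']
    response['hits']['hits']
  end
end

# Iteration as it was before the fix: every yielded doc triggers a refresh.
class OldEach < BaseHelper
  def each
    @docs = []
    @scroll_id = nil
    refresh_docs
    for doc in @docs do
      refresh_docs
      yield doc
    end
  end

  private

  def refresh_docs
    @docs << results
    @docs.flatten!
  end
end

# Iteration after the fix: one request per page, nothing stored on the helper.
class NewEach < BaseHelper
  def each(&block)
    until (docs = results).empty?
      docs.each(&block)
    end
    @scroll_id = nil
  end
end

# Returns [documents yielded, requests sent] for a helper class.
def requests_for(helper_class)
  client = CountingClient.new(total: 24, page_size: 12)
  yielded = 0
  helper_class.new(client).each { yielded += 1 }
  [yielded, client.requests]
end

puts "old: #{requests_for(OldEach).inspect}" # old: [24, 25]
puts "new: #{requests_for(NewEach).inspect}" # new: [24, 3]
```

With this stub, the old loop sends the initial search plus one scroll request per hit (25 requests for 24 hits). The rewritten loop sends one request per page plus a final request that returns an empty page (3 requests).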
2 changes: 1 addition & 1 deletion elasticsearch/lib/elasticsearch/version.rb
@@ -16,5 +16,5 @@
 # under the License.

 module Elasticsearch
-  VERSION = '8.17.0'.freeze
+  VERSION = '8.17.1'.freeze
 end
108 changes: 55 additions & 53 deletions elasticsearch/spec/integration/helpers/scroll_helper_spec.rb
@@ -18,64 +18,66 @@
 require 'elasticsearch/helpers/scroll_helper'

 context 'Elasticsearch client helpers' do
-  let(:index) { 'books' }
-  let(:body) { { size: 12, query: { match_all: {} } } }
-  let(:scroll_helper) { Elasticsearch::Helpers::ScrollHelper.new(client, index, body) }
+  context 'ScrollHelper' do
+    let(:index) { 'books' }
+    let(:body) { { size: 12, query: { match_all: {} } } }
+    let(:scroll_helper) { Elasticsearch::Helpers::ScrollHelper.new(client, index, body) }

-  before do
-    documents = [
-      { index: { _index: index, data: {name: "Leviathan Wakes", "author": "James S.A. Corey", "release_date": "2011-06-02", "page_count": 561} } },
-      { index: { _index: index, data: {name: "Hyperion", "author": "Dan Simmons", "release_date": "1989-05-26", "page_count": 482} } },
-      { index: { _index: index, data: {name: "Dune", "author": "Frank Herbert", "release_date": "1965-06-01", "page_count": 604} } },
-      { index: { _index: index, data: {name: "Dune Messiah", "author": "Frank Herbert", "release_date": "1969-10-15", "page_count": 331} } },
-      { index: { _index: index, data: {name: "Children of Dune", "author": "Frank Herbert", "release_date": "1976-04-21", "page_count": 408} } },
-      { index: { _index: index, data: {name: "God Emperor of Dune", "author": "Frank Herbert", "release_date": "1981-05-28", "page_count": 454} } },
-      { index: { _index: index, data: {name: "Consider Phlebas", "author": "Iain M. Banks", "release_date": "1987-04-23", "page_count": 471} } },
-      { index: { _index: index, data: {name: "Pandora's Star", "author": "Peter F. Hamilton", "release_date": "2004-03-02", "page_count": 768} } },
-      { index: { _index: index, data: {name: "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585} } },
-      { index: { _index: index, data: {name: "A Fire Upon the Deep", "author": "Vernor Vinge", "release_date": "1992-06-01", "page_count": 613} } },
-      { index: { _index: index, data: {name: "Ender's Game", "author": "Orson Scott Card", "release_date": "1985-06-01", "page_count": 324} } },
-      { index: { _index: index, data: {name: "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328} } },
-      { index: { _index: index, data: {name: "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227} } },
-      { index: { _index: index, data: {name: "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268} } },
-      { index: { _index: index, data: {name: "Foundation", "author": "Isaac Asimov", "release_date": "1951-06-01", "page_count": 224} } },
-      { index: { _index: index, data: {name: "The Giver", "author": "Lois Lowry", "release_date": "1993-04-26", "page_count": 208} } },
-      { index: { _index: index, data: {name: "Slaughterhouse-Five", "author": "Kurt Vonnegut", "release_date": "1969-06-01", "page_count": 275} } },
-      { index: { _index: index, data: {name: "The Hitchhiker's Guide to the Galaxy", "author": "Douglas Adams", "release_date": "1979-10-12", "page_count": 180} } },
-      { index: { _index: index, data: {name: "Snow Crash", "author": "Neal Stephenson", "release_date": "1992-06-01", "page_count": 470} } },
-      { index: { _index: index, data: {name: "Neuromancer", "author": "William Gibson", "release_date": "1984-07-01", "page_count": 271} } },
-      { index: { _index: index, data: {name: "The Handmaid's Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311} } },
-      { index: { _index: index, data: {name: "Starship Troopers", "author": "Robert A. Heinlein", "release_date": "1959-12-01", "page_count": 335} } },
-      { index: { _index: index, data: {name: "The Left Hand of Darkness", "author": "Ursula K. Le Guin", "release_date": "1969-06-01", "page_count": 304} } },
-      { index: { _index: index, data: {name: "The Moon is a Harsh Mistress", "author": "Robert A. Heinlein", "release_date": "1966-04-01", "page_count": 288 } } }
-    ]
-    client.bulk(body: documents, refresh: 'wait_for')
-  end
-
-  after do
-    client.indices.delete(index: index)
-  end
+    before do
+      documents = [
+        { index: { _index: index, data: {name: "Leviathan Wakes", "author": "James S.A. Corey", "release_date": "2011-06-02", "page_count": 561} } },
+        { index: { _index: index, data: {name: "Hyperion", "author": "Dan Simmons", "release_date": "1989-05-26", "page_count": 482} } },
+        { index: { _index: index, data: {name: "Dune", "author": "Frank Herbert", "release_date": "1965-06-01", "page_count": 604} } },
+        { index: { _index: index, data: {name: "Dune Messiah", "author": "Frank Herbert", "release_date": "1969-10-15", "page_count": 331} } },
+        { index: { _index: index, data: {name: "Children of Dune", "author": "Frank Herbert", "release_date": "1976-04-21", "page_count": 408} } },
+        { index: { _index: index, data: {name: "God Emperor of Dune", "author": "Frank Herbert", "release_date": "1981-05-28", "page_count": 454} } },
+        { index: { _index: index, data: {name: "Consider Phlebas", "author": "Iain M. Banks", "release_date": "1987-04-23", "page_count": 471} } },
+        { index: { _index: index, data: {name: "Pandora's Star", "author": "Peter F. Hamilton", "release_date": "2004-03-02", "page_count": 768} } },
+        { index: { _index: index, data: {name: "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585} } },
+        { index: { _index: index, data: {name: "A Fire Upon the Deep", "author": "Vernor Vinge", "release_date": "1992-06-01", "page_count": 613} } },
+        { index: { _index: index, data: {name: "Ender's Game", "author": "Orson Scott Card", "release_date": "1985-06-01", "page_count": 324} } },
+        { index: { _index: index, data: {name: "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328} } },
+        { index: { _index: index, data: {name: "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227} } },
+        { index: { _index: index, data: {name: "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268} } },
+        { index: { _index: index, data: {name: "Foundation", "author": "Isaac Asimov", "release_date": "1951-06-01", "page_count": 224} } },
+        { index: { _index: index, data: {name: "The Giver", "author": "Lois Lowry", "release_date": "1993-04-26", "page_count": 208} } },
+        { index: { _index: index, data: {name: "Slaughterhouse-Five", "author": "Kurt Vonnegut", "release_date": "1969-06-01", "page_count": 275} } },
+        { index: { _index: index, data: {name: "The Hitchhiker's Guide to the Galaxy", "author": "Douglas Adams", "release_date": "1979-10-12", "page_count": 180} } },
+        { index: { _index: index, data: {name: "Snow Crash", "author": "Neal Stephenson", "release_date": "1992-06-01", "page_count": 470} } },
+        { index: { _index: index, data: {name: "Neuromancer", "author": "William Gibson", "release_date": "1984-07-01", "page_count": 271} } },
+        { index: { _index: index, data: {name: "The Handmaid's Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311} } },
+        { index: { _index: index, data: {name: "Starship Troopers", "author": "Robert A. Heinlein", "release_date": "1959-12-01", "page_count": 335} } },
+        { index: { _index: index, data: {name: "The Left Hand of Darkness", "author": "Ursula K. Le Guin", "release_date": "1969-06-01", "page_count": 304} } },
+        { index: { _index: index, data: {name: "The Moon is a Harsh Mistress", "author": "Robert A. Heinlein", "release_date": "1966-04-01", "page_count": 288 } } }
+      ]
+      client.bulk(body: documents, refresh: 'wait_for')
+    end

-  it 'instantiates a scroll helper' do
-    expect(scroll_helper).to be_an_instance_of Elasticsearch::Helpers::ScrollHelper
-  end
+    after do
+      client.indices.delete(index: index)
+    end

-  it 'searches an index' do
-    my_documents = []
-    while !(documents = scroll_helper.results).empty?
-      my_documents << documents
+    it 'instantiates a scroll helper' do
+      expect(scroll_helper).to be_an_instance_of Elasticsearch::Helpers::ScrollHelper
     end

-    expect(my_documents.flatten.size).to eq 24
-  end
+    it 'searches an index' do
+      my_documents = []
+      while !(documents = scroll_helper.results).empty?
+        my_documents << documents
+      end

-  it 'uses enumerable' do
-    count = 0
-    scroll_helper.each { |a| count += 1 }
-    expect(count).to eq 24
-    expect(scroll_helper).to respond_to(:count)
-    expect(scroll_helper).to respond_to(:reject)
-    expect(scroll_helper).to respond_to(:uniq)
-    expect(scroll_helper.map { |a| a['_id'] }.uniq.count).to eq 24
+      expect(my_documents.flatten.size).to eq 24
+    end
+
+    it 'uses enumerable' do
+      count = 0
+      scroll_helper.each { count += 1 }
+      expect(count).to eq 24
+      expect(scroll_helper).to respond_to(:count)
+      expect(scroll_helper).to respond_to(:reject)
+      expect(scroll_helper).to respond_to(:uniq)
+      expect(scroll_helper.map { |a| a['_id'] }.uniq.count).to eq 24
+    end
   end
 end
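The 'uses enumerable' example calls `each` and then `map` on the same helper and expects 24 documents both times. That appears to depend on `clear` resetting `@scroll_id`, so the next enumeration starts with a fresh search. The sketch below illustrates this without a cluster. `RestartingClient` and `SketchScrollHelper` are hypothetical names, the stub's behaviour (a new `search` restarts from the first page) is an assumption about how a real cluster would respond, and the body of `results` is assumed because the diff does not show it.

```ruby
# Hypothetical stub client: `search` starts a new scroll from the first page,
# `scroll` returns the next page, and an exhausted scroll returns no hits.
class RestartingClient
  def initialize(total:, page_size:)
    @all_pages = (1..total).map { |i| { '_id' => i.to_s } }.each_slice(page_size).to_a
    @cursor = 0
  end

  def search(**_args)
    @cursor = 0
    next_page
  end

  def scroll(**_args)
    next_page
  end

  def clear_scroll(**_args); end

  private

  def next_page
    page = @all_pages[@cursor] || []
    @cursor += 1
    { '_scroll_id' => 'stub', 'hits' => { 'hits' => page } }
  end
end

# Sketch of the helper after this change; not the library class itself.
class SketchScrollHelper
  include Enumerable

  def initialize(client, index, body, scroll = '1m')
    @client = client
    @index = index
    @body = body
    @scroll = scroll
  end

  # Fetches one page at a time and yields its hits; stores no documents.
  def each(&block)
    until (docs = results).empty?
      docs.each(&block)
    end
    clear
  end

  # Assumed implementation: search on the first call, scroll afterwards.
  def results
    return scroll_request if @scroll_id

    initial_search
  end

  # Resetting @scroll_id lets the next enumeration begin with a new search.
  def clear
    @client.clear_scroll(body: { scroll_id: @scroll_id }) if @scroll_id
    @scroll_id = nil
  end

  private

  def initial_search
    response = @client.search(index: @index, scroll: @scroll, body: @body)
    @scroll_id = response['_scroll_id']
    response['hits']['hits']
  end

  def scroll_request
    @client.scroll(body: { scroll: @scroll, scroll_id: @scroll_id })['hits']['hits']
  end
end

client = RestartingClient.new(total: 24, page_size: 12)
helper = SketchScrollHelper.new(client, 'books', { size: 12, query: { match_all: {} } })

puts helper.count                               # 24
puts helper.map { |doc| doc['_id'] }.uniq.count # 24
```

Each Enumerable call runs the whole scroll again, so against a real cluster every `count`, `map`, or `reject` costs a full pass over the index.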