We're seeing poor filtering results from the Solr index.
If I try to search by an individual GCOOS dataset id (see this search for 'Data for ioos-station-wmo-42400'), I get essentially the full list of datasets returned (~76,068 total datasets). The results do at least appear to be sorted by relevance (most relevant at top), but judging by the result count, essentially no filtering is happening.
Another example: searching for osu592-20230524T1813-delayed with org=Glider DAC returns 6877 datasets without quotes and 2 datasets with quotes.
Results are more reasonable for other simple phrase searches like 'Mote' or 'NERACOOS'.
This is very likely caused by how the free-text search is configured in the stock CKAN schema.
If you remove the "T" from the ISO8601 date strings in the search you will get much more reasonable results. Letters adjacent to numbers appear to be getting tokenized separately. Quoting will also work, but this may not be immediately obvious.
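To illustrate the suspected behavior, here is a minimal Python sketch. It is not Solr's analyzer: `split_tokens` is a hypothetical approximation that breaks terms at punctuation and at letter/digit boundaries, and the unquoted search assumes the resulting sub-tokens are combined with OR semantics, which has not been confirmed against the deployed schema. The dataset names in `DOCS` other than the glider id from this issue are made up.

```python
import re

def split_tokens(text: str) -> list[str]:
    # Approximation only: lowercase, then keep runs of letters and runs of
    # digits as separate tokens, dropping punctuation such as '-'.
    return re.findall(r"[a-z]+|[0-9]+", text.lower())

def contains_phrase(tokens: list[str], phrase: list[str]) -> bool:
    # True if `phrase` appears as a contiguous run inside `tokens`.
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))

def search(query: str, docs: list[str], quoted: bool) -> list[str]:
    q = split_tokens(query)
    hits = []
    for doc in docs:
        d = split_tokens(doc)
        if quoted:
            # Quoted: all sub-tokens must appear together, in order.
            matched = contains_phrase(d, q)
        else:
            # Unquoted (assumed OR semantics): any shared sub-token matches.
            matched = any(tok in d for tok in q)
        if matched:
            hits.append(doc)
    return hits

# Hypothetical dataset names, except the glider id taken from this issue.
DOCS = [
    "osu592-20230524T1813-delayed",
    "ru29-20230101T0000-delayed",
    "M01 buoy",
    "station 01 met data",
]

print(split_tokens("osu592-20230524T1813-delayed"))
print(search("osu592-20230524T1813-delayed", DOCS, quoted=False))
print(search("osu592-20230524T1813-delayed", DOCS, quoted=True))
```

Under these assumptions the unquoted glider id also matches the unrelated `ru29-...` entry, because both share the lone `t` and `delayed` sub-tokens, while the quoted form matches only the exact id. That would be consistent with the large drop in result counts when quotes are added.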
Separating this issue out from #252
Testing today on the simple search string: 'M01':
Without quotes: M01 yields ~60,590 results: https://data.ioos.us/dataset/?q=M01&sort=score+desc%2C+metadata_modified+desc&ext_timerange_start=&ext_timerange_end=&ext_min_depth=&ext_max_depth=&ext_bbox=
With quotes: "M01" yields 33 results: https://data.ioos.us/dataset/?q=%22M01%22&sort=score+desc%2C+metadata_modified+desc&ext_timerange_start=&ext_timerange_end=&ext_min_depth=&ext_max_depth=&ext_bbox=
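For anyone reproducing the comparison, the two URLs differ only in whether the `q` value is wrapped in double quotes (encoded as `%22`). A small sketch for building both variants follows; `build_search_url` is a hypothetical helper, and it leaves out the empty `ext_*` parameters on the assumption that they do not affect the result.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.ioos.us/dataset/"
SORT = "score desc, metadata_modified desc"

def build_search_url(term: str, exact: bool = False) -> str:
    # Wrapping the term in double quotes requests a phrase match.
    q = f'"{term}"' if exact else term
    return f"{BASE_URL}?{urlencode({'q': q, 'sort': SORT})}"

print(build_search_url("M01"))
print(build_search_url("M01", exact=True))
```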
Search for 'NERACOOS' yields ~397 results: https://data.ioos.us/dataset/?q=NERACOOS&sort=score+desc%2C+metadata_modified+desc&ext_timerange_start=&ext_timerange_end=&ext_min_depth=&ext_max_depth=&ext_bbox=