Case-Insensitive Sorting for Localized Fields
Problem
Sorting on string fields (e.g. name, acronym) is case-sensitive by default in Elasticsearch 8: "Zebra" sorts before "alpha" because uppercase letters precede lowercase letters in byte order. This affects Network, Study, Dataset, and Variable entities.
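To make the ordering problem concrete, here is a small plain-Java illustration. It does not touch Elasticsearch. It only shows that natural string ordering puts "Zebra" first, while comparing on a lowercased key gives the expected order. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

class SortOrderDemo {
    // Natural String ordering compares code units, so 'Z' (0x5A) precedes 'a' (0x61).
    static List<String> sortByCodeUnit(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    // Comparing on a lowercased key mimics what a lowercase normalizer does to the indexed value.
    static List<String> sortByLowercaseKey(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.comparing((String s) -> s.toLowerCase(Locale.ROOT)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = List.of("alpha", "Zebra", "beta");
        System.out.println(sortByCodeUnit(names));     // [Zebra, alpha, beta]
        System.out.println(sortByLowercaseKey(names)); // [alpha, beta, Zebra]
    }
}
```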
Solution
Add a .sort sub-field with a lowercase_normalizer to sortable string fields in the ES index mapping. At query time, RQLFieldResolver routes sort requests to the .sort sub-field when it exists, detected dynamically via IndexFieldMapping.isSortable().
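As a rough sketch of the mapping shape this produces, the snippet below builds the normalizer settings and a field mapping with a sort sub-field as plain Java maps. The keyword type, the normalizer parameter, and a custom normalizer with the lowercase filter are standard Elasticsearch constructs. The parent field type passed in and the helper names are assumptions for illustration, not the actual AbstractIndexConfiguration code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SortSubFieldMappingSketch {
    // Index settings fragment: a custom normalizer that lowercases keyword values.
    static Map<String, Object> normalizerSettings() {
        return Map.of("analysis", Map.of("normalizer", Map.of("lowercase_normalizer",
                Map.of("type", "custom", "filter", List.of("lowercase")))));
    }

    // Field mapping fragment: the parent field keeps its own type (assumed by the caller)
    // and gains a keyword sub-field named "sort" that uses the normalizer above.
    static Map<String, Object> sortableFieldMapping(String parentType) {
        Map<String, Object> sortField = new LinkedHashMap<>();
        sortField.put("type", "keyword");
        sortField.put("normalizer", "lowercase_normalizer");

        Map<String, Object> mapping = new LinkedHashMap<>();
        mapping.put("type", parentType);
        mapping.put("fields", Map.of("sort", sortField));
        return mapping;
    }

    public static void main(String[] args) {
        System.out.println(sortableFieldMapping("text"));
    }
}
```

Because the normalizer is applied to the indexed value of the sub-field only, the parent field's search behaviour is unchanged.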
Changes
mica-search-es8
- elasticsearch.yml: add lowercase_normalizer to the analysis.normalizer block.
- AbstractIndexConfiguration: add createMappingWithAnalyzersAndSort (non-localized) and createLocalizedMappingWithAnalyzersAndSort (localized) mapping methods, and createMappingWithAnalyzersAndSortNonLocalized wrapper.
- NetworkIndexConfiguration, StudyIndexConfiguration, DatasetIndexConfiguration: exclude name/acronym from addTaxonomyFields and map them explicitly with createLocalizedMappingWithAnalyzersAndSort.
- VariableIndexConfiguration: use createMappingWithAnalyzersAndSortNonLocalized for name.
- ESIndexer.IndexFieldMappingImpl: implement isSortable(fieldName) using JSONPath to check for a .sort sub-field in the live ES mapping.
- RQLQuery.RQLBuilder: delegate resolveFieldForSort to RQLFieldResolver, used in RQLSortBuilder.processArgument.
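The isSortable check can be pictured roughly as follows. The PR uses JSONPath against the live mapping. This sketch swaps in plain nested-map traversal to stay dependency-free, and it assumes the mapping has already been reduced to its properties object and that field names are dotted paths (e.g. name.en). Both are assumptions, not the ESIndexer implementation.

```java
import java.util.Map;

class SortableLookupSketch {
    // Walks a dotted field name through nested "properties" objects,
    // then reports whether the resolved field declares a "sort" sub-field.
    static boolean isSortable(Map<String, Object> properties, String fieldName) {
        Map<String, Object> current = null;
        Map<String, Object> level = properties;
        for (String part : fieldName.split("\\.")) {
            if (level == null) return false;          // path goes deeper than the mapping
            current = asMap(level.get(part));
            if (current == null) return false;        // unknown field
            level = asMap(current.get("properties")); // descend for the next path segment
        }
        Map<String, Object> fields = current == null ? null : asMap(current.get("fields"));
        return fields != null && fields.containsKey("sort");
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object value) {
        return value instanceof Map ? (Map<String, Object>) value : null;
    }
}
```

Detecting the sub-field from the mapping itself means an index that has not been rebuilt yet simply reports false and sorting falls back to the original field.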
mica-spi
- IndexFieldMapping: add isSortable(String fieldName).
- RQLFieldResolver: add resolveFieldForSort public method; update FieldData.Builder.sortable() to call indexFieldMapping.isSortable() instead of checking the localized vocabulary attribute, covering both localized and non-localized sortable fields.
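A minimal sketch of the routing decision, assuming resolveFieldForSort takes an already-resolved field name and returns the name to sort on. The nested interface is a hypothetical stand-in for the SPI type and models only the isSortable method named above. The real RQLFieldResolver signature may differ.

```java
import java.util.Set;

class SortFieldResolutionSketch {
    // Hypothetical stand-in for the SPI interface; only isSortable is modeled.
    interface IndexFieldMapping {
        boolean isSortable(String fieldName);
    }

    // Route to the ".sort" sub-field only when the mapping reports that it exists.
    static String resolveFieldForSort(IndexFieldMapping mapping, String fieldName) {
        return mapping.isSortable(fieldName) ? fieldName + ".sort" : fieldName;
    }

    public static void main(String[] args) {
        Set<String> sortable = Set.of("name.en", "acronym.en");
        IndexFieldMapping mapping = sortable::contains;
        System.out.println(resolveFieldForSort(mapping, "name.en")); // name.en.sort
        System.out.println(resolveFieldForSort(mapping, "id"));      // id
    }
}
```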
Notes
- Re-indexing required: mapping changes do not apply to existing indices. Drop and rebuild affected indices via the Mica admin re-index function after deploying.
- mica-search-os2: the same gap exists in the OpenSearch plugin; RQLFieldResolver changes are shared via the SPI.
- Future: the current hardcoded approach for built-in fields (name, acronym, label) is a temporary measure. The plan is to introduce a sortable attribute on taxonomy vocabularies as the authoritative source for which fields get a .sort sub-field in the ES mapping. A companion internal attribute will be introduced to prevent system-managed vocabularies from appearing in the Mica Administration UI. Once implemented, custom fields defined in user taxonomies will automatically get case-insensitive sorting by setting sortable: true on their vocabulary, with no code changes required.