FAISSCache._load_index calls read_index() on every invocation - no in-memory caching of the loaded index object #24

@gkennos

Description

Is there an existing issue for this?

  • I have searched the existing issues

Bug summary

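FAISSCache._load_index calls faiss.read_index() on every invocation and never keeps the loaded index object in memory. For a 4.8 GB HNSW index this costs ~2 s of disk I/O per query, making FAISS search indistinguishable from a flat scan in practice.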

Related issues

  • ef_search from HNSWIndexConfig is never applied at search time. The FAISS default (16) happens to match the config default, so this is not a problem for now, but it should be addressed alongside this fix.
  • metadata.npz is similarly reloaded on each call via _load_meta.

Proposed fix

Add an instance-level cache keyed on (metric_type, index_config). A single-slot cache (last-used tuple) is sufficient for sequential access patterns; a dict is safer for callers that alternate between index configs on the same instance.

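def __init__(self, ...):
    ...
    # One loaded index per (metric_type, index_config); index_config must be hashable.
    self._index_cache: dict[tuple, faiss.Index] = {}

def _load_index(self, metric_type, index_config):
    key = (metric_type, index_config)
    if key not in self._index_cache:
        path = self._faiss_path(metric_type, index_config)
        idx = faiss.read_index(str(path))
        if isinstance(index_config, HNSWIndexConfig):
            # Unwrap IndexIDMap (and any other wrapper) to reach the HNSW index.
            # A wrapper's .index may come back typed as the base faiss.Index,
            # so downcast it before checking for .hnsw.
            inner = idx
            while hasattr(inner, 'index') and not hasattr(inner, 'hnsw'):
                inner = faiss.downcast_index(inner.index)
            if hasattr(inner, 'hnsw'):
                inner.hnsw.efSearch = index_config.ef_search
        self._index_cache[key] = idx
    return self._index_cache[key]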

Note: the index object is wrapped in IndexIDMap on disk, so ef_search must be applied to the inner index after unwrapping.
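
The cache key above only works if (metric_type, index_config) is hashable. The following is a minimal sketch of the dict-keyed cache, assuming the config can be modelled as a frozen dataclass. HNSWConfigSketch, its fields, IndexCacheSketch and fake_loader are hypothetical stand-ins for illustration, not the project's real types, and the loader replaces the faiss.read_index call so the sketch runs without FAISS.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)  # frozen => hashable, so instances can be dict keys
class HNSWConfigSketch:
    """Hypothetical stand-in for HNSWIndexConfig; field names are assumed."""
    m: int = 32
    ef_search: int = 16


class IndexCacheSketch:
    """Dict-keyed cache; `loader` stands in for the faiss.read_index path."""

    def __init__(self, loader: Callable[[str, Any], Any]) -> None:
        self._loader = loader
        self._cache: Dict[Tuple[str, Any], Any] = {}
        self.loads = 0  # counts real loads, for illustration only

    def get(self, metric_type: str, index_config: Any) -> Any:
        key = (metric_type, index_config)
        if key not in self._cache:
            self.loads += 1
            self._cache[key] = self._loader(metric_type, index_config)
        return self._cache[key]


def fake_loader(metric_type: str, index_config: Any) -> object:
    # Placeholder for the expensive disk read; returns a fresh object per call.
    return object()


cache = IndexCacheSketch(fake_loader)
a = cache.get("cosine", HNSWConfigSketch(ef_search=64))
b = cache.get("cosine", HNSWConfigSketch(ef_search=64))   # equal config => cache hit
c = cache.get("cosine", HNSWConfigSketch(ef_search=128))  # different config => new load
```

One design point this raises: if the on-disk path does not depend on ef_search, keying on the full config would keep two copies of the same index for configs that differ only in ef_search. Keying on the resolved path and applying ef_search per call is one possible alternative.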

At larger-scale coverage (~3.5M vectors, 1024-dim float32) the HNSW index would be ~15 GB. Holding that in process RAM via an instance cache may not be acceptable in all deployments.

See the FAISS wiki page on on-disk indexes regarding IO_FLAG_MMAP_IFC as a lower-RSS alternative for flat/HNSW indexes.
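
As a rough sanity check of the ~15 GB figure, the estimate can be worked through as below. The vector count and dimension come from the text above. The HNSW M value, the 32-bit neighbour ids and the 64-bit external ids are assumptions made for illustration, and upper graph levels are ignored.

```python
# Back-of-envelope size estimate for an HNSW-flat index wrapped in an ID map.
N_VECTORS = 3_500_000   # from the issue text
DIM = 1024              # from the issue text
BYTES_PER_FLOAT32 = 4
HNSW_M = 32             # ASSUMPTION: M is not stated in the issue
BYTES_PER_LINK = 4      # ASSUMPTION: 32-bit neighbour ids
BYTES_PER_ID = 8        # ASSUMPTION: 64-bit external ids in the ID map

# Raw float32 vector storage dominates the total.
vector_bytes = N_VECTORS * DIM * BYTES_PER_FLOAT32
# Level-0 neighbour lists hold up to 2*M links per vector; upper levels ignored.
link_bytes = N_VECTORS * 2 * HNSW_M * BYTES_PER_LINK
id_bytes = N_VECTORS * BYTES_PER_ID
total_bytes = vector_bytes + link_bytes + id_bytes

for name, n in [("vectors", vector_bytes), ("links", link_bytes),
                ("ids", id_bytes), ("total", total_bytes)]:
    print(f"{name:8s} {n / 1e9:6.2f} GB")
```

Under these assumptions the vectors alone account for about 14.3 GB, so the choice of M changes the total only slightly.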

Code for reproduction

Root cause

FAISSCache._load_index (storage/faiss/faiss_cache.py):

def _load_index(self, metric_type, index_config):
    path = self._faiss_path(metric_type, index_config)
    index = faiss.read_index(str(path))   # called fresh every time
    return index
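
The per-call cost can be checked with a small timing harness. This is a sketch only: time_calls is a hypothetical helper, and the workload used here is a stand-in so the snippet runs without FAISS or an index file. Against a real instance, the callable would wrap _load_index as shown in the comment.

```python
import time
from statistics import median
from typing import Callable, List


def time_calls(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Return wall-clock seconds for each of `repeats` calls to `fn`."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


# Hypothetical usage against a real FAISSCache instance:
#   timings = time_calls(lambda: cache._load_index(metric_type, index_config))
# Stand-in workload so the sketch runs on its own:
timings = time_calls(lambda: sum(range(100_000)), repeats=3)
print(f"median per-call time: {median(timings):.6f} s")
```

Without caching, every call should take roughly the same time. With an instance-level cache, calls after the first would be expected to drop to near zero.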
