Feat/multimodal knowledge store #489

Izukimat · 2025-07-13T14:24:13Z

Summary

What does this PR do? Please provide a brief summary of the changes introduced.

Expands BaseKnowledgeStore and RAGSystem to support multimodal retrieval using a separate-collections strategy.

Description

This PR expands the knowledge store layer to support multimodal embeddings and implements the separate collections strategy for multimodal RAG. Key changes include:

BaseKnowledgeStore Changes:

Added MultiModalEmbedding type supporting text, image, audio, and video embeddings
Updated retrieve() method signature to accept QueryEmbedding (a Union of list[float] and MultiModalEmbedding)
Added new retrieve_by_modality() method for modality-specific collection queries
Updated both sync and async base classes
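As a rough illustration of the type changes listed above, here is a minimal sketch. Only the `QueryEmbedding` alias is taken from the PR diff. The field layout of `MultiModalEmbedding`, the `as_multimodal` helper, the `SketchKnowledgeStore` class, and the return types are assumptions made for illustration and may differ from the actual code in src/fed_rag/base/knowledge_store.py.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class MultiModalEmbedding:
    """Hypothetical container: one optional vector per modality."""

    text: Optional[list[float]] = None
    image: Optional[list[float]] = None
    audio: Optional[list[float]] = None
    video: Optional[list[float]] = None


# Union type for backward compatibility (this alias appears in the PR diff)
QueryEmbedding = Union[list[float], MultiModalEmbedding]


def as_multimodal(query_emb: QueryEmbedding) -> MultiModalEmbedding:
    """Normalize a query embedding.

    Assumed convention: a bare list[float] is treated as a text-only
    embedding, so existing text-only callers keep working.
    """
    if isinstance(query_emb, MultiModalEmbedding):
        return query_emb
    return MultiModalEmbedding(text=list(query_emb))


class SketchKnowledgeStore(ABC):
    """Hypothetical stand-in for the sync base class; signatures are assumed."""

    @abstractmethod
    def retrieve(
        self, query_emb: QueryEmbedding, top_k: int
    ) -> list[tuple[float, str]]:
        """Return (score, node_id) pairs for a text-only or multimodal query."""

    @abstractmethod
    def retrieve_by_modality(
        self, modality: str, query_emb: list[float], top_k: int
    ) -> list[tuple[float, str]]:
        """Return (score, node_id) pairs from a single modality's collection."""
```

With this shape, a legacy call that passes a plain `list[float]` and a new call that passes a `MultiModalEmbedding` can both go through the same `retrieve()` entry point.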

RAG System Changes:

Updated method signatures to accept str | Query for multimodal input support
Added _prepare_modality_embeddings() method to extract embeddings by modality from retriever tensors
Enhanced retrieve() method to query separate collections for each modality and merge results
Improved _format_context() with configuration-driven approach for modality-specific formatting
Added robust tensor dimension handling (1D, 2D, >2D cases)
Maintained full backward compatibility with existing text-only workflows
Assumes a separate-collections strategy: one collection each for text, image, audio, and video
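The retrieve-and-merge flow described above could look roughly like the sketch below. It uses plain Python lists instead of retriever tensors so it runs without extra dependencies. The names `to_1d` and `retrieve_across_collections`, the dot-product scoring, and the policy of taking the first row for 2D and higher-dimensional inputs are all assumptions for illustration, not the PR's actual implementation.

```python
from typing import Optional, Sequence

Vector = list[float]
Scored = tuple[float, str, str]  # (score, modality, node_id)


def to_1d(values: Sequence) -> Vector:
    """Reduce nested sequences to a single vector.

    Assumed policy: 1D input is kept as-is, 2D input uses its first row,
    and >2D input keeps descending into the first element.
    """
    while values and isinstance(values[0], (list, tuple)):
        values = values[0]
    return [float(v) for v in values]


def dot(a: Vector, b: Vector) -> float:
    """Dot-product similarity (a stand-in for the store's real metric)."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_across_collections(
    collections: dict[str, list[tuple[str, Vector]]],
    query_embs: dict[str, Optional[Sequence]],
    top_k: int,
) -> list[Scored]:
    """Query each modality's collection separately, then merge by score."""
    merged: list[Scored] = []
    for modality, emb in query_embs.items():
        # Skip modalities with no query embedding or no collection.
        if emb is None or modality not in collections:
            continue
        query = to_1d(emb)
        scored = [
            (dot(query, vec), modality, node_id)
            for node_id, vec in collections[modality]
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        merged.extend(scored[:top_k])
    # Merge step: keep the overall top_k across all modalities.
    merged.sort(key=lambda item: item[0], reverse=True)
    return merged[:top_k]


collections = {
    "text": [("t1", [1.0, 0.0]), ("t2", [0.0, 1.0])],
    "image": [("i1", [0.25, 0.25])],
}
query_embs = {"text": [[1.0, 0.0]], "image": [1.0, 1.0], "audio": None}
results = retrieve_across_collections(collections, query_embs, top_k=2)
```

One caveat with this kind of merge: similarity scores from different embedding spaces are not necessarily comparable, so a real implementation might normalize scores per modality or reserve a quota per collection before merging.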

Any information reviewers should be aware of:

This is a draft PR - concrete implementations (InMemoryKnowledgeStore, QdrantKnowledgeStore) and comprehensive tests are not yet included

Checklist

Before submitting your PR, please check off the following:

My code follows the existing style and conventions
I've run linting (make lint)
I've added/updated relevant documentation (will add in final version)
I've added/updated tests as needed (planned for final version)
I've verified integration with existing tools (HuggingFace, LlamaIndex, LangChain, etc. if applicable) (public interface unchanged, should work)
I've added an entry to the CHANGELOG.md (if applicable) (will add for final version)

nerdai

Will take another look in a bit!

nerdai · 2025-07-14T02:01:16Z

src/fed_rag/base/knowledge_store.py

+
+
+# Union type for backward compatibility
+QueryEmbedding = Union[list[float], MultiModalEmbedding]


Ohh, I kind of like how you did this better. Subtle difference, but I think cleaner than what I got.

Izuki Matsuba added 2 commits July 13, 2025 08:49

base_knowledge_store update

5986d22

update rag_system

762f1d5

Izukimat requested a review from nerdai July 13, 2025 14:24

nerdai reviewed Jul 14, 2025

nerdai linked an issue Jul 18, 2025 that may be closed by this pull request

[Feature] Expand BaseKnowledgeStore to accept multi-modal inputs when retrieving. #493

Open
