[BUG] LightRAG file selection #561

Open
newbie-Li opened this issue Dec 12, 2024 · 2 comments · May be fixed by #627
Labels
bug Something isn't working

Comments

@newbie-Li

Description

When "Search All" is selected, or any group is selected under "Search in Files", an empty file_ids list is sent to the retriever pipeline.

libs\ktem\ktem\index\file\graph\graph_index.GraphRAGIndex -> get_retriever_pipelines

Imports:

from typing import Any
from sqlalchemy.orm import Session
import json

from ktem.index.file import FileIndex
from ktem.db.models import engine

from ..base import BaseFileIndexIndexing, BaseFileIndexRetriever
from .pipelines import GraphRAGIndexingPipeline, GraphRAGRetrieverPipeline

Replace `is_all, sel_ids, _ = selected` with:

        is_all, sel_ids, _ = selected
        if is_all == "all":
            # "Search All": collect every file that has a graph index entry.
            Index = self._resources.get("Index")
            with Session(engine) as session:
                all_id = (
                    session.query(Index.source_id)
                    .filter(Index.relation_type == "graph")
                    .all()
                )
                file_ids = [i[0] for i in all_id]
        else:
            # "Search in Files": entries starting with "[" are parsed as a JSON list of file IDs.
            file_ids = []
            for item in sel_ids:
                if item.startswith("["):
                    group_file_ids = json.loads(item)
                    file_ids.extend(group_file_ids)
                else:
                    file_ids.append(item)
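
The selection handling above can be tried outside the app with a small standalone sketch. It assumes, as the patch does, that a group entry arrives as a JSON-encoded list of file IDs and a single file arrives as a plain ID string. `expand_selection` and `all_graph_file_ids` are hypothetical names, and the database query is replaced by a plain list, so this only illustrates the flattening logic and is not kotaemon code.

```python
import json


def expand_selection(mode, sel_ids, all_graph_file_ids):
    """Flatten a file-selector value into a list of file IDs.

    mode: "all" means every graph-indexed file; anything else uses sel_ids.
    sel_ids: plain file IDs, or JSON-encoded lists of IDs for groups (assumption).
    all_graph_file_ids: stand-in for the Index table query used in the patch.
    """
    if mode == "all":
        return list(all_graph_file_ids)

    file_ids = []
    for item in sel_ids:
        if item.startswith("["):
            # A group entry: decode the JSON list and add each member.
            file_ids.extend(json.loads(item))
        else:
            # A single file entry: the item is the file ID itself.
            file_ids.append(item)
    return file_ids
```

For example, `expand_selection("select", ["x", '["y", "z"]'], [])` flattens one file and one group into a single list, which is what the retriever pipeline would then receive instead of an empty list.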

Reproduction steps

1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

Screenshots

![image](https://github.com/user-attachments/assets/907929ab-b59b-443d-a795-8896418f361a)

Logs

No response

Browsers

No response

OS

No response

Additional information

No response

@newbie-Li newbie-Li added the bug Something isn't working label Dec 12, 2024
@newbie-Li
Author

This is not only a simple selection bug.
Each time files are uploaded, a unique ID is created, a folder named after that ID is created, and the LightRAG data is stored in that folder.
When multiple files are selected, only the first file ID is picked; the unique ID linked to it is looked up, and the LightRAG query is created from that.

I have 10 PDFs and uploaded 5 first, then the other 5. It seems that if I select all files, only the first 5 PDFs are searched.
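
A toy model of the lookup described above may make the effect easier to see. Everything here is hypothetical: the `file_to_graph` mapping, the `graph-A`/`graph-B` IDs, and both function names are made up for illustration and do not come from the kotaemon code base. The first function mimics taking only the first file's linked ID; the second shows what collecting every linked ID would look like.

```python
# Hypothetical mapping: file ID -> ID of the graph folder created by its upload batch.
# Batch 1 (pdf_1..pdf_5) and batch 2 (pdf_6..pdf_10) each got their own folder.
file_to_graph = {f"pdf_{i}": "graph-A" for i in range(1, 6)}
file_to_graph.update({f"pdf_{i}": "graph-B" for i in range(6, 11)})


def graph_ids_first_only(file_ids, mapping):
    """Mimic the reported behaviour: only the first file's graph is used."""
    if not file_ids:
        return []
    return [mapping[file_ids[0]]]


def graph_ids_all(file_ids, mapping):
    """Collect every distinct graph linked to the selection, in selection order."""
    seen = []
    for file_id in file_ids:
        graph_id = mapping[file_id]
        if graph_id not in seen:
            seen.append(graph_id)
    return seen


selected = list(file_to_graph)  # all 10 PDFs selected
```

With all 10 PDFs selected, `graph_ids_first_only` yields only the first batch's graph, which matches the symptom of the second 5 PDFs never being searched.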

@eddprogrammer
Contributor

I am seeing the same thing. It seems that LightRAG searches against ALL documents in the same index no matter what you selected, and only searches documents in that one index even if you select files from multiple indexes.
I ran different test scenarios with two markup resume files:

  1. John_Doe_Resume.txt
  2. Jane_Smith_Resume.txt

Drag & drop each file separately into the LightRAG file collection:

  1. Two separate indexes are created under ktem_app_data\user_data\files\lightrag\
  2. A query only searches one index at a time even if both files are selected. The "Search All" button doesn't work either: it complains that no documents are selected.
  3. When selecting multiple files in the drop down, only the first one selected will be included in the search. The others will be ignored.

Drag & drop both files at the same time to be indexed:

  1. One index is created under ktem_app_data\user_data\files\lightrag\
  2. Query will search both documents
  3. Query will search both documents EVEN if I only selected one file in the drop down.
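
The two scenarios above can be summarised with a small model of which documents a query ends up reaching. This is only a sketch of the observed behaviour, not the actual retriever: `searched_documents`, the `idx-1`/`idx-2` IDs, and the shape of the `indexes` mapping are all assumptions made for illustration.

```python
def searched_documents(selected, indexes):
    """Return the documents a query reaches under the behaviour reported above.

    selected: file names in selection order.
    indexes: hypothetical mapping of index ID -> list of file names it contains.
    """
    if not selected:
        return []
    first = selected[0]
    for members in indexes.values():
        if first in members:
            # The whole index of the first selected file is searched,
            # regardless of what else was (or was not) selected.
            return list(members)
    return []


# Scenario 1: files uploaded separately, so each has its own index.
separate = {
    "idx-1": ["John_Doe_Resume.txt"],
    "idx-2": ["Jane_Smith_Resume.txt"],
}

# Scenario 2: files uploaded together, so they share one index.
together = {
    "idx-1": ["John_Doe_Resume.txt", "Jane_Smith_Resume.txt"],
}
```

In the first scenario, selecting both files reaches only the first file's index; in the second, selecting one file still reaches both documents.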

I believe LightRAG supports incremental indexing, i.e. adding new documents updates the existing index instead of creating a new one. It would be great if that could be implemented.

varunsharma27 added a commit to varunsharma27/kotaemon that referenced this issue Jan 13, 2025
@varunsharma27 varunsharma27 linked a pull request Jan 13, 2025 that will close this issue
8 tasks