Insight Fusion is a research-automation agent that synthesizes insights from user-uploaded documents and real-time online sources. Built on LangChain and LangGraph, it uses an iterative, self-correcting workflow to improve retrieval accuracy, validate claims, and ground its summaries in evidence.
- Hybrid Context Retrieval: Combines proprietary data (PDF and text files) with Tavily/SerpAPI web search results.
- Iterative Refinement: Uses a cyclic LangGraph workflow to critique and refine search queries until sufficient evidence is gathered.
- Fact-Checking & Validation: Dedicated graph nodes cross-reference generated claims against the retrieved sources.
- Streamlit Interface: A clean, modern UI for interacting with the agent and visualizing the synthesis process.
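As an illustration of the hybrid retrieval idea, here is a minimal sketch in plain Python. It is not the project's code: `Evidence`, `merge_evidence`, and the sample hits are hypothetical names, and it assumes local and web scores are already on a comparable scale, which a real pipeline would need to handle (for example with a reranker).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    text: str
    source: str   # "local" (uploaded document) or "web" (search result)
    score: float  # higher means more relevant; assumed comparable across sources


def merge_evidence(local, web, top_k=3):
    """Merge two result lists, drop near-duplicate texts, keep the top_k by score."""
    best = {}
    for item in [*local, *web]:
        # Normalize case and whitespace so trivially different copies collapse.
        key = " ".join(item.text.lower().split())
        if key not in best or item.score > best[key].score:
            best[key] = item
    return sorted(best.values(), key=lambda e: e.score, reverse=True)[:top_k]


# Hypothetical sample data standing in for vector-store and web-search hits.
local_hits = [
    Evidence("LangGraph supports cycles.", "local", 0.9),
    Evidence("Chroma stores embeddings.", "local", 0.4),
]
web_hits = [
    Evidence("langgraph supports  cycles.", "web", 0.7),
    Evidence("Tavily returns web results.", "web", 0.8),
]
merged = merge_evidence(local_hits, web_hits)
```

Keeping the `source` field on every item lets later steps cite whether a claim came from an uploaded file or from the web.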
The core of the agent is a state machine defined in src/agent.py. The workflow proceeds as follows:
- Query Decomposition: Breaks down complex user requests into sub-questions.
- Hybrid Search: Fetches documents from the vector store and the web.
- Relevance Grading: Evaluates retrieved documents for relevance.
- Generation & Hallucination Check: Generates an answer and verifies it against the documents.
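The control flow of the steps above can be sketched in plain Python. This is a simplified stand-in, not the contents of src/agent.py: it uses no LangGraph or LLM calls, every function and name here (`State`, `run`, `refine`, `MAX_ATTEMPTS`, the sample corpus) is hypothetical, and keyword overlap substitutes for real retrieval and grading.

```python
from dataclasses import dataclass, field

MAX_ATTEMPTS = 3  # assumed cap so the refinement cycle always terminates


@dataclass
class State:
    question: str
    queries: list = field(default_factory=list)
    documents: list = field(default_factory=list)
    answer: str = ""
    attempts: int = 0


def key_terms(text):
    # Words longer than four characters, lowercased.
    return {w for w in text.lower().split() if len(w) > 4}


def decompose(state):
    # Stand-in for an LLM call: split the request on " and ".
    state.queries = [q.strip() for q in state.question.split(" and ")]
    return state


def search(state, corpus):
    # Stand-in for vector-store plus web search: loose word overlap.
    state.attempts += 1
    words = {w for q in state.queries for w in q.lower().split()}
    state.documents = [d for d in corpus if words & set(d.lower().split())]
    return state


def grade(state):
    # Relevance grading: keep only documents sharing a key term with a query.
    wanted = set().union(*(key_terms(q) for q in state.queries))
    state.documents = [d for d in state.documents if wanted & key_terms(d)]
    return state


def refine(state):
    # Query refinement: here, just strip trailing punctuation.
    state.queries = [q.strip("?!. ") for q in state.queries]
    return state


def generate(state):
    state.answer = "\n".join(state.documents)
    return state


def is_grounded(state):
    # Hallucination check: every key term in the answer must appear in a source.
    supported = set().union(*(key_terms(d) for d in state.documents))
    return key_terms(state.answer) <= supported


def run(question, corpus):
    state = decompose(State(question))
    while state.attempts < MAX_ATTEMPTS:
        state = grade(search(state, corpus))
        if state.documents:
            break
        state = refine(state)  # loop back with rewritten queries
    state = generate(state)
    if not state.documents or not is_grounded(state):
        state.answer = "Insufficient evidence."
    return state


CORPUS = [
    "LangGraph builds cyclic agent graphs",
    "Chroma is a vector store",
    "Streamlit renders dashboards",
]
result = run("What is LangGraph? and What is Chroma?", CORPUS)
```

In this toy run the first pass finds nothing relevant because the query terms still carry question marks, so the loop refines the queries and searches again. In a LangGraph implementation the same loop would typically be expressed as a conditional edge from the grading node back to the search node.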
- Clone the repository.
- Install dependencies:
pip install -r requirements.txt
- Set up your environment variables:
cp .env.example .env # Add OPENAI_API_KEY, TAVILY_API_KEY, etc.
- Run the Streamlit app:
streamlit run app.py
- Orchestration: LangChain, LangGraph
- LLM: GPT-4o / Claude 3.5 Sonnet
- Vector Store: ChromaDB / FAISS
- Frontend: Streamlit
- Search: Tavily API
- Add multi-modal support (charts/images).
- Implement persistent memory/checkpointing with SQLite.
- Add PDF export for research reports.