RAGnarōk is a VS Code extension that implements a full Retrieval-Augmented Generation (RAG) pipeline, exposing a Copilot-compatible language model tool for agentic query processing. It supports multiple embedding backends, retrieval strategies, iterative refinement with LLM-powered gap analysis, and per-topic vector stores backed by LanceDB.
## Core Capabilities

| Capability | Description |
| --- | --- |
| Multi-format ingestion | PDF, Markdown, HTML, plain text, GitHub repos, web pages |
| Semantic chunking | Structure-aware splitting with heading metadata preservation |
| 4 retrieval strategies | Vector, Hybrid, Ensemble (RRF), BM25 |
| Dual embedding backends | HuggingFace Transformers.js (local ONNX) or VS Code LM API |
| Agentic query planning | LLM-powered query decomposition with heuristic fallback |
| Iterative refinement | Gap analysis → follow-up query generation → convergence detection |
| Per-topic isolation | Independent vector stores, document caches, and metadata per topic |
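The Ensemble strategy's Reciprocal Rank Fusion can be sketched in a few lines of TypeScript. This is an illustration of the technique, not the extension's actual code; the function name and the classic `k = 60` smoothing constant are assumptions.

```typescript
// Reciprocal Rank Fusion: merge several ranked lists of document IDs.
// Each list contributes 1 / (k + rank + 1) per document; documents ranked
// highly by multiple retrievers accumulate the largest fused scores.
interface RankedDoc {
  id: string;
  score: number;
}

function reciprocalRankFusion(rankings: string[][], k = 60): RankedDoc[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Because RRF uses only ranks, not raw scores, it needs no score normalization when fusing a vector retriever with BM25.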
## 2. High-Level Architecture

```mermaid
graph TB
    subgraph "VS Code Host"
        UI[Tree Views & Commands]
        Config[Configuration Panel]
    end
    subgraph "Extension Core"
        EXT[extension.ts<br/>Activation & Wiring]
        CMD[CommandHandler<br/>Command Registry]
        TOOL[RAGTool<br/>Copilot LM Tool]
    end
    subgraph "Agentic Layer"
        AGENT[RAGAgent<br/>Orchestrator]
        QP[QueryPlannerAgent<br/>Decomposition]
        LLM[VSCodeLLM<br/>LangChain Wrapper]
    end
    subgraph "Retrieval Layer"
        VR[VectorRetriever<br/>Semantic Search]
        KR[KeywordRetriever<br/>BM25 + Keyword Scoring]
        HR[HybridRetriever<br/>Weighted Score Fusion]
        ER[EnsembleRetriever<br/>RRF Rank Fusion]
    end
    subgraph "Embedding Layer"
        ES[EmbeddingService<br/>Backend Router]
        HF[HuggingFaceBackend<br/>Local ONNX/WASM]
        VLM[VscodeLmBackend<br/>Proposed API]
        MR[ModelRegistry<br/>Model Discovery]
    end
    subgraph "Storage Layer"
        TM[TopicManager<br/>Topic Lifecycle]
        DP[DocumentPipeline<br/>Ingestion Orchestrator]
        VSF[VectorStoreFactory<br/>LanceDB Manager]
        DLF[DocumentLoaderFactory<br/>Multi-Format Orchestrator]
        SC[SemanticChunker<br/>Structure-Aware Splitter]
    end
    subgraph "Persistence"
        LANCE[(LanceDB<br/>Vector Tables)]
        META[(JSON Files<br/>Topics Index & Metadata)]
    end
    UI --> CMD
    Config --> CMD
    EXT --> CMD
    EXT --> TOOL
    EXT --> TM
    EXT --> ES
    TOOL --> AGENT
    AGENT --> QP
    QP --> LLM
    AGENT --> VR
    AGENT --> KR
    AGENT --> HR
    AGENT --> ER
    VR --> ES
    HR --> ES
    ER --> KR
    ES --> HF
    ES --> VLM
    ES --> MR
    CMD --> TM
    TM --> DP
    TM --> VSF
    DP --> DLF
    DP --> SC
    DP --> ES
    DP --> VSF
    VSF --> LANCE
    TM --> META
    classDef core fill:#4a9eff,stroke:#2d7cd6,color:#fff
    classDef agent fill:#ff6b6b,stroke:#d64545,color:#fff
    classDef retrieval fill:#51cf66,stroke:#37b24d,color:#fff
    classDef embedding fill:#ffd43b,stroke:#f59f00,color:#333
    classDef storage fill:#845ef7,stroke:#7048e8,color:#fff
    classDef persist fill:#868e96,stroke:#495057,color:#fff
    class EXT,CMD,TOOL core
    class AGENT,QP,LLM agent
    class VR,KR,HR,ER retrieval
    class ES,HF,VLM,MR embedding
    class TM,DP,VSF,DLF,SC storage
    class LANCE,META persist
```
## 3. Document Ingestion Pipeline
The ingestion pipeline transforms raw files into searchable vector embeddings stored in LanceDB.
Each pipeline execution returns a PipelineResult containing:
| Field | Description |
| --- | --- |
| `stages` | Boolean success per stage (loading, chunking, embedding, storing) |
| `metadata.originalDocuments` | Count of source documents loaded |
| `metadata.chunksCreated` | Total chunks after splitting |
| `metadata.chunksEmbedded` | Chunks successfully embedded |
| `metadata.chunksStored` | Chunks written to LanceDB |
| `metadata.stageTimings` | Per-stage timing breakdown |
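The fields above suggest a result shape along these lines. This is a hypothetical TypeScript reconstruction from the table, not the extension's actual declaration:

```typescript
// Hypothetical shape of PipelineResult, reconstructed from the documented
// fields; exact property types in the extension may differ.
interface PipelineResult {
  stages: {
    loading: boolean;
    chunking: boolean;
    embedding: boolean;
    storing: boolean;
  };
  metadata: {
    originalDocuments: number;
    chunksCreated: number;
    chunksEmbedded: number;
    chunksStored: number;
    stageTimings: Record<string, number>; // e.g. milliseconds per stage
  };
}
```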
### Loader Module Architecture
The document loading system uses a modular architecture with a shared DocumentLoader interface. DocumentLoaderFactory is a thin orchestrator that delegates to format-specific loaders.
```mermaid
classDiagram
    class DocumentLoader {
        <<interface>>
        +load(filePath, options) Promise~LangChainDocument[]~
    }
    class TextDocumentLoader {
        +load(filePath, options) Promise~LangChainDocument[]~
    }
    class MarkdownDocumentLoader {
        +load(filePath, options) Promise~LangChainDocument[]~
    }
    class HtmlDocumentLoader {
        +load(filePath, options) Promise~LangChainDocument[]~
    }
    class PdfDocumentLoader {
        +load(filePath, options) Promise~LangChainDocument[]~
    }
    class GithubDocumentLoader {
        +load(url, options) Promise~LangChainDocument[]~
    }
    class WebDocumentLoader {
        +load(url, options) Promise~LangChainDocument[]~
    }
    DocumentLoader <|.. TextDocumentLoader
    DocumentLoader <|.. MarkdownDocumentLoader
    DocumentLoader <|.. HtmlDocumentLoader
    DocumentLoader <|.. PdfDocumentLoader
    DocumentLoader <|.. GithubDocumentLoader
    DocumentLoader <|.. WebDocumentLoader
```
| Module | File | Method |
| --- | --- | --- |
| `TextDocumentLoader` | `src/loaders/textLoader.ts` | UTF-8 file read, returns single document |
| `MarkdownDocumentLoader` | `src/loaders/markdownLoader.ts` | Text read with `isMarkdown` and `preserveStructure` metadata |
| `HtmlDocumentLoader` | `src/loaders/htmlLoader.ts` | Regex-based: strips `<script>`, `<style>`, comments, all tags; decodes HTML entities; normalizes whitespace |
| `PdfDocumentLoader` | `src/loaders/pdfLoader.ts` | Delegates to LangChain `PDFLoader` (pdf-parse), optional page splitting |
| `GithubDocumentLoader` | `src/loaders/githubLoader.ts` | Delegates to LangChain `GithubRepoLoader`, supports GitHub Enterprise |
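The regex-based extraction described for `HtmlDocumentLoader` can be sketched as below. The function name and the (deliberately small) entity table are assumptions; the loader's actual entity coverage may be broader.

```typescript
// Sketch of regex-based HTML-to-text conversion: strip <script>/<style>
// blocks and comments, drop remaining tags, decode common entities,
// then collapse whitespace.
function htmlToText(html: string): string {
  const entities: Record<string, string> = {
    "&amp;": "&", "&lt;": "<", "&gt;": ">",
    "&quot;": '"', "&#39;": "'", "&nbsp;": " ",
  };
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, "") // drop script blocks
    .replace(/<style[\s\S]*?<\/style>/gi, "")   // drop style blocks
    .replace(/<!--[\s\S]*?-->/g, "")            // drop HTML comments
    .replace(/<[^>]+>/g, " ")                   // strip remaining tags
    .replace(/&[a-z]+;|&#\d+;/gi, (m) => entities[m] ?? m) // decode entities
    .replace(/\s+/g, " ")                       // normalize whitespace
    .trim();
}
```

Regex stripping avoids a DOM dependency, at the cost of being approximate on malformed or deeply nested markup.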
### 9.1 Extension Activation

```mermaid
sequenceDiagram
    participant VSC as VS Code
    participant EXT as extension.ts
    participant TM as TopicManager
    participant ES as EmbeddingService
    participant CMD as CommandHandler
    participant TOOL as RAGTool
    participant TV as TreeViews
    VSC->>EXT: activate(context)
    EXT->>TM: getInstance(context)
    TM->>TM: init() [load topics index]
    EXT->>ES: getInstance()
    ES->>ES: resolveBackend() [background]
    EXT->>CMD: registerCommands(context)
    EXT->>TV: new TopicTreeDataProvider()
    EXT->>TV: new ConfigTreeDataProvider()
    EXT->>TOOL: RAGTool.register(context)
    TOOL->>VSC: vscode.lm.registerTool()
    EXT->>VSC: setContext('ragnarok.loaded', true)
    EXT->>VSC: setContext('ragnarok.hasTopics', count > 0)
```
### 9.2 Document Ingestion
```mermaid
sequenceDiagram
    participant U as User
    participant CMD as CommandHandler
    participant TM as TopicManager
    participant DP as DocumentPipeline
    participant DLF as DocumentLoaderFactory
    participant SC as SemanticChunker
    participant ES as EmbeddingService
    participant VSF as VectorStoreFactory
    participant DB as LanceDB
    U->>CMD: Add Document command
    CMD->>TM: addDocuments(topicId, filePaths)
    TM->>DP: processDocuments(filePaths, topicId, options)
    rect rgb(240, 248, 255)
        Note over DP,DLF: Stage 1: Loading
        DP->>DLF: loadDocument(options) per file
        DLF->>DLF: detectFileType() → strategy
        DLF-->>DP: LoadedDocument[] with metadata
    end
    rect rgb(240, 255, 240)
        Note over DP,SC: Stage 2: Chunking
        DP->>SC: chunkDocuments(documents, chunkingOptions)
        SC->>SC: determineStrategy() → markdown|recursive|code
        SC->>SC: split + enrichChunksInBatches()
        SC-->>DP: ChunkingResult {chunks, stats}
    end
    rect rgb(255, 248, 240)
        Note over DP,ES: Stage 3: Embedding
        DP->>ES: embedBatch(chunkTexts, progressCallback)
        ES->>ES: executeWithFallback(embed)
        ES-->>DP: number[][] vectors
    end
    rect rgb(248, 240, 255)
        Note over DP,DB: Stage 4: Storage
        DP->>VSF: addDocuments(topicId, chunks)
        VSF->>DB: LanceDB.fromDocuments() or addDocuments()
        VSF->>VSF: saveStoreMetadata(topicId)
        VSF-->>DP: stored
    end
    DP-->>TM: PipelineResult
    TM->>TM: Update topic index + document cache
    TM-->>CMD: AddDocumentResult
```
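The four-stage flow above can be sketched as a generic stage runner that records the per-stage success booleans and timings reported in `PipelineResult`. The stage names mirror the diagram; everything else (the runner's name, shape, and abort-on-failure policy) is an assumption for illustration.

```typescript
// Run ingestion stages in order, recording success and wall-clock timing
// per stage; a failed stage aborts the remainder of the pipeline.
type StageName = "loading" | "chunking" | "embedding" | "storing";

async function runStages(
  stages: Record<StageName, () => Promise<void>>
): Promise<{
  stages: Partial<Record<StageName, boolean>>;
  stageTimings: Partial<Record<StageName, number>>;
}> {
  const ok: Partial<Record<StageName, boolean>> = {};
  const timings: Partial<Record<StageName, number>> = {};
  for (const name of ["loading", "chunking", "embedding", "storing"] as StageName[]) {
    const start = Date.now();
    try {
      await stages[name]();
      ok[name] = true;
    } catch {
      ok[name] = false;
      timings[name] = Date.now() - start;
      break; // later stages never run after a failure
    }
    timings[name] = Date.now() - start;
  }
  return { stages: ok, stageTimings: timings };
}
```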
### 9.3 Agentic Query Execution
```mermaid
sequenceDiagram
    participant COP as Copilot
    participant TOOL as RAGTool
    participant TM as TopicManager
    participant RA as RAGAgent
    participant QP as QueryPlannerAgent
    participant LLM as VSCodeLLM
    participant RET as Retriever
    COP->>TOOL: executeQuery({topic, query, topK})
    TOOL->>TM: findBestMatchingTopic(topic)
    TM-->>TOOL: Topic (exact|similar|fallback)
    TOOL->>TOOL: getOrCreateAgent(topicId)
    TOOL->>RA: query(query, agenticOptions)
    rect rgb(255, 245, 245)
        Note over RA,QP: Phase 1: Planning
        RA->>QP: createPlan(query, options)
        QP->>QP: analyzeComplexityScore()
        QP->>QP: heuristicPlan()
        alt Complex + LLM available
            QP->>LLM: refinePlanWithLLM()
            LLM-->>QP: Zod-validated plan
        end
        QP-->>RA: QueryPlan {subQueries, complexity, strategy}
    end
    rect rgb(245, 255, 245)
        Note over RA,RET: Phase 2: Retrieval
        loop Each sub-query
            RA->>RET: search(subQuery, {k, strategy})
            RET-->>RA: RetrievalResult[]
        end
    end
    rect rgb(245, 245, 255)
        Note over RA,LLM: Phase 3: Iterative Refinement
        alt complex query (not simple)
            loop Until converged or maxIterations
                RA->>RA: calculateAvgConfidence()
                alt confidence < threshold
                    RA->>RA: analyzeGaps()
                    RA->>QP: generateFollowUpPlan(gaps)
                    QP->>LLM: Refine follow-ups
                    LLM-->>QP: Follow-up queries
                    QP-->>RA: Follow-up plan
                    RA->>RET: Execute follow-ups
                    RET-->>RA: Additional results
                    RA->>RA: Merge + deduplicate
                end
            end
        end
    end
    RA->>RA: Final dedup + re-rank + topK
    RA-->>TOOL: RAGResult
    TOOL->>TOOL: Format RAGQueryResult + agenticMetadata
    TOOL-->>COP: JSON response
```
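The confidence-gated loop in Phase 3 can be sketched as follows. The result shape, the 0.7 threshold, the iteration cap, and the helper names are illustrative assumptions, not the extension's API; gap analysis is reduced here to a caller-supplied follow-up function.

```typescript
// Iterative refinement: while average confidence is below a threshold,
// run follow-up retrieval and merge the new results (deduplicated by
// content), up to a fixed iteration budget.
interface RetrievalResult {
  content: string;
  confidence: number;
}

function calculateAvgConfidence(results: RetrievalResult[]): number {
  if (results.length === 0) return 0;
  return results.reduce((sum, r) => sum + r.confidence, 0) / results.length;
}

function refine(
  results: RetrievalResult[],
  followUp: () => RetrievalResult[], // stands in for gap analysis + follow-up queries
  threshold = 0.7,
  maxIterations = 3
): RetrievalResult[] {
  for (let i = 0; i < maxIterations; i++) {
    if (calculateAvgConfidence(results) >= threshold) break; // converged
    const extra = followUp();
    const seen = new Set(results.map((r) => r.content));
    results = results.concat(extra.filter((r) => !seen.has(r.content))); // merge + dedupe
  }
  return results;
}
```

Deduplicating before merging is what makes convergence detection meaningful: once follow-ups stop contributing new chunks, confidence stops changing and the loop exits at the iteration cap.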
### 9.4 Embedding Backend Fallback
```mermaid
sequenceDiagram
    participant C as Caller
    participant ES as EmbeddingService
    participant VLM as VscodeLmBackend
    participant HF as HuggingFaceBackend
    C->>ES: embed(text)
    ES->>ES: ensureBackend()
    alt activeBackendType = vscodeLM
        ES->>VLM: embed(text)
        alt Success
            VLM-->>ES: number[]
            ES-->>C: number[]
        else Failure + auto mode
            VLM--xES: Error
            ES->>ES: shouldFallbackToHuggingFace()
            ES->>HF: initialize()
            ES->>ES: activeBackendType = 'huggingface'
            ES->>HF: embed(text)
            HF-->>ES: number[]
            ES-->>C: number[]
            Note over ES: Show warning to user
        end
    else activeBackendType = huggingface
        ES->>HF: embed(text)
        HF-->>ES: number[]
        ES-->>C: number[]
    end
```
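The routing in the diagram above can be sketched as a small try/catch router: attempt the VS Code LM backend first and, on failure in auto mode, switch permanently to the local HuggingFace backend. The backend names mirror the diagram; the `EmbeddingBackend` interface and router class are assumptions for illustration.

```typescript
// Backend router with one-way fallback: after the first vscodeLM failure,
// all subsequent calls go straight to the huggingface backend.
interface EmbeddingBackend {
  embed(text: string): Promise<number[]>;
}

class EmbeddingRouter {
  activeBackendType: "vscodeLM" | "huggingface" = "vscodeLM";

  constructor(
    private vscodeLm: EmbeddingBackend,
    private huggingFace: EmbeddingBackend
  ) {}

  async embed(text: string): Promise<number[]> {
    if (this.activeBackendType === "vscodeLM") {
      try {
        return await this.vscodeLm.embed(text);
      } catch {
        // Auto mode: record the switch (the real service also warns the user).
        this.activeBackendType = "huggingface";
      }
    }
    return this.huggingFace.embed(text);
  }
}
```

Making the fallback sticky matters for ingestion: embeddings produced by different backends have different dimensions, so a topic's vectors must all come from the same backend once embedding has started.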