Status: ✅ V7 OPERATIONAL + RAG Quality Verification Complete | Version: v7.1.0 | Last Updated: October 10, 2025
Moj AI is a multi-tenant SaaS application providing AI-powered assistance for Slovenian building legislation. The V7 orchestration system is fully operational, and the RAG Quality Verification System is complete.
- ✅ API Key Decryption Fixed: Double decryption bug resolved (Oct 9, 2025)
- ✅ Claude API Working: Legal reasoning tool now receives valid API keys
- ✅ Intelligent Query Analysis: Automatic detection of RAG/internet research needs
- ✅ Admin-First Architecture: Settings applied immediately (no singleton pattern)
- ✅ Multi-Source Intelligence: RAG + Internet Research + LLM knowledge
- ✅ Real-Time Progress: SSE implemented with progress tracking
- ✅ Research-Quality Responses: 10,000-25,000 character responses verified
- ✅ Orchestration Transparency: Full reports with tools used, sources, processing time
- ✅ Quality Score Calculation: 0-100 scoring for all documents
- ✅ Admin Quality Dashboard: Document management with quality badges
- ✅ RAG Testing Interface: /admin/rag-testing for query testing
- ✅ User Document Upload: Quality validation prevents bad files
- ✅ Enhanced User Chat: Prominent sources display with type icons
- ✅ 100% Processing Success: All test documents processed correctly
- ✅ Authentication: Google OAuth integration
- ✅ Billing: Stripe subscription management
- ✅ Database: PostgreSQL, Weaviate, Redis
- ✅ RAG System: Multimodal document processing
- ✅ Admin UI: Configuration management
- ✅ V7 Orchestration: Operational with Claude Sonnet 4.5
graph TB
User[User Interface] --> Frontend[Next.js Frontend]
Admin[Admin Interface] --> Frontend
Frontend --> SSE[Server-Sent Events]
Frontend --> API[FastAPI Backend]
API --> Auth[Google OAuth]
API --> Stripe[Stripe Integration]
API --> V7[V7 Orchestration Engine]
API --> DB[(PostgreSQL Database)]
API --> Redis[(Redis Cache)]
V7 --> QueryAnalyzer[Query Analyzer]
V7 --> ToolCoordinator[Tool Coordinator]
V7 --> ResponseSynthesizer[Response Synthesizer]
QueryAnalyzer --> Needs{Detect Needs}
Needs --> RAGTool[RAG Search Tool]
Needs --> InternetTool[Internet Research Tool]
Needs --> LegalTool[Legal Reasoning Tool]
RAGTool --> Weaviate[(Weaviate Vector DB)]
InternetTool --> Perplexity[Perplexity API]
LegalTool --> LLMGateway[LLM Gateway]
LLMGateway --> Claude[Anthropic Claude]
LLMGateway --> OpenAI[OpenAI GPT-4]
ToolCoordinator --> RAGTool
ToolCoordinator --> InternetTool
ToolCoordinator --> LegalTool
ResponseSynthesizer --> FinalResponse[Research-Quality Response]
DB --> Conversations[Conversations]
DB --> Messages[Messages]
DB --> Users[Users & Tenants]
DB --> AdminSettings[Admin Settings]
DB --> RAGDocuments[RAG Documents]
SSE --> Progress[Real-time Progress Updates]
- User Interface: Clean chat interface with real-time progress indicators
- Conversation Management: Create, rename, delete, persist conversations
- Admin Panel: Configuration interface for V7 orchestration settings
- Authentication: Google OAuth integration with role-based access
- State Management: React hooks and context for auth, usage, language, settings
- Real-time Features: Server-Sent Events for live progress updates
- Orchestration Report: Display tools used, processing time, confidence scores
- File Upload: Document attachment with RAG integration
- Voice Input: Speech-to-text functionality
- Language Selection: English ↔ Slovenian switching
- Styling: Tailwind CSS with Shadcn/ui components
- REST API: RESTful endpoints for all operations
- Authentication: JWT-based auth with Google OAuth
- Real-time Streaming: Server-Sent Events for progress updates
- V7 Orchestration: Intelligent multi-tool coordination
- Database Integration: PostgreSQL with conversation persistence
- Middleware: CORS, performance monitoring, error handling
- Validation: Pydantic models for request/response validation
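The real-time streaming item above relies on the Server-Sent Events wire format, where each frame is a set of `field: value` lines terminated by a blank line. The following is a minimal sketch of how progress frames could be serialized; the function names, the `progress` and `done` event names, and the `percent` field are assumptions for illustration, not the backend's actual API.

```python
import json


def format_sse(event: str, data: dict) -> str:
    """Serialize one SSE frame: an event name, a JSON data line, a blank line."""
    payload = json.dumps(data, ensure_ascii=False)
    return f"event: {event}\ndata: {payload}\n\n"


def progress_stream(steps):
    """Yield one 'progress' frame per step, then a final 'done' frame.

    The event names and the 'percent' field are hypothetical.
    """
    total = len(steps)
    for index, message in enumerate(steps, start=1):
        percent = round(100 * index / total)
        yield format_sse("progress", {"message": message, "percent": percent})
    yield format_sse("done", {"message": "Response ready"})
```

In a FastAPI handler, a generator like this would typically be wrapped in a streaming response with the `text/event-stream` media type.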
Architecture: No singleton pattern; a new engine is created per request
Components:
- Query Analyzer: Detects RAG, internet research, legal reasoning needs
- Tool Coordinator: Executes tools in correct order with context passing
- Response Synthesizer: Combines results with citations and sources
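To make the Query Analyzer's role concrete, here is a minimal sketch of need detection. It assumes simple keyword matching; the keyword sets, the `QueryNeeds` type, and `analyze_query` are hypothetical, and the real analyzer may well use an LLM or a classifier instead.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword sets, chosen only for illustration.
RAG_KEYWORDS = {"odlok", "opn", "zoning", "regulation", "permit"}
INTERNET_KEYWORDS = {"price", "prices", "cost", "costs", "form", "forms", "current"}


@dataclass
class QueryNeeds:
    needs_rag: bool
    needs_internet: bool
    # Assumption: legal reasoning runs for every query.
    needs_legal_reasoning: bool = True


def analyze_query(query: str) -> QueryNeeds:
    """Flag which tools a query needs by matching whole words against keywords."""
    words = set(re.findall(r"\w+", query.lower()))
    return QueryNeeds(
        needs_rag=bool(words & RAG_KEYWORDS),
        needs_internet=bool(words & INTERNET_KEYWORDS),
    )
```

Matching whole words rather than substrings avoids false positives such as "form" inside "information".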
Tools:
- 🚧 RAG Search Tool: Queries Weaviate for legal documents (Odlok_OPN_MOL_ID.pdf, etc.)
- 🚧 Internet Research Tool: Uses Perplexity for current prices, forms, market data
- 🚧 Legal Reasoning Tool: Uses lead orchestrator (Claude/GPT-4) for analysis
- 🚧 Form Discovery Tool: Finds government forms and applications
- 🚧 Source Validation Tool: Validates sources and citations
- 🚧 Quality Validation Tool: Ensures response quality and completeness
Features:
- 🚧 Intelligent Detection: Automatically detects when RAG/internet research needed
- 🚧 Admin Control: Settings applied immediately (no singleton!)
- 🚧 Real-Time Progress: Specific progress messages ("Searching legal documents...")
- 🚧 Research Quality: 5-10 page responses with inline citations [1], [2], [3]
- 🚧 Orchestration Reports: Shows tools used, processing time, confidence
- 🚧 Multi-Source: Combines RAG + Internet + LLM knowledge
Documentation: See architecture/v7/ for complete specifications
- Configuration Management: Stores orchestration settings in database
- Lead Orchestrator Config: Provider, model, temperature, max_tokens
- Master System Message: Configurable system message for orchestration
- Hot Reload: Settings applied immediately to new queries
- Caching: Redis caching for performance
- Multi-tenant Architecture: Isolated data per tenant
- Conversation Management: All CRUD operations with persistence
- User Management: Authentication and authorization
- Admin Settings: Orchestration configuration storage
- RAG Documents: Document metadata and processing status
- Usage Tracking: Detailed analytics and billing data
- Document Storage: Slovenian legal documents and regulations
- Vector Search: Hybrid search (vector + keyword)
- Collections: legal_documents (admin), user_documents (user)
- Multimodal Support: PDFs (including scanned), DOC/DOCX, text files
- Chunking Strategy: 500-1000 tokens with 100 token overlap
- Metadata: Municipality, area, document type, page numbers
- Current Documents: Odlok_OPN_MOL_ID.pdf (6.0 MB, 9 chunks), 220528_OPN_MOM.pdf (2.0 MB, 80 chunks)
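The chunking strategy above (500-1000 tokens with a 100-token overlap) can be sketched as a sliding window. This is an illustrative version that takes an already-tokenized list; the function name and the default chunk size of 800 are assumptions, and the production pipeline's tokenizer is not specified in this document.

```python
def chunk_tokens(tokens, chunk_size=800, overlap=100):
    """Split a token list into windows that share `overlap` tokens with the next one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the current window already reaches the end of the document
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some tokens twice.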
- Provider Abstraction: Unified interface for Anthropic, OpenAI, Perplexity
- Model Selection: Dynamic model selection from admin settings
- Fallback Logic: Handles provider failures gracefully
- Cost Tracking: Tracks token usage and costs per provider
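The fallback and cost-tracking behaviour listed above could look roughly like the sketch below. The `Provider` and `LLMGateway` shapes, the `(text, tokens_used)` return convention, and the per-1k-token rates are all assumptions made for illustration; they are not the gateway's real interface or real provider pricing.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Provider:
    name: str
    call: Callable[[str], Tuple[str, int]]  # hypothetical: returns (text, tokens_used)
    cost_per_1k_tokens: float               # placeholder rate, not real pricing


@dataclass
class LLMGateway:
    providers: List[Provider]
    costs: Dict[str, float] = field(default_factory=dict)

    def complete(self, prompt: str) -> Tuple[str, str]:
        """Try providers in order; return (provider_name, text) from the first success."""
        errors = []
        for provider in self.providers:
            try:
                text, tokens = provider.call(prompt)
            except Exception as exc:  # any provider failure triggers the fallback
                errors.append(f"{provider.name}: {exc}")
                continue
            cost = tokens / 1000 * provider.cost_per_1k_tokens
            self.costs[provider.name] = self.costs.get(provider.name, 0.0) + cost
            return provider.name, text
        raise RuntimeError("All providers failed: " + "; ".join(errors))
```

Keeping the provider order in a plain list makes it easy to rebuild the gateway from admin settings whenever the preferred model changes.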
- User Input: Query submitted through frontend chat interface
- Authentication: JWT validation and tenant identification
- Admin Settings Load: Get current orchestration settings from database
- Engine Creation: Create NEW V7 orchestration engine (no singleton!)
- Query Analysis: Analyze query to detect needs (RAG, internet, legal)
- Tool Selection: Select required tools based on analysis
- Tool Execution: Execute tools in sequence with context passing
- RAG Search (if needed): Query Weaviate for legal documents
- Internet Research (if needed): Query Perplexity for current info
- Legal Reasoning: Use lead orchestrator with RAG/internet context
- Response Synthesis: Combine tool results with citations
- Quality Validation: Verify response quality and completeness
- Real-time Progress: Send specific progress updates via SSE
- Database Persistence: Save conversation and orchestration metadata
- Orchestration Report: Return tools used, processing time, confidence
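The flow above can be condensed into a small sketch: load settings, build a new engine for the request, run the needed tools in order while passing context forward, and return a report. The class and function names, the tool signature `(query, context) -> str`, and the fixed tool order are assumptions; the actual engine is specified in architecture/v7/.

```python
import time
from dataclasses import dataclass, field


@dataclass
class OrchestrationReport:
    tools_used: list = field(default_factory=list)
    processing_time_s: float = 0.0


class V7Engine:
    """Hypothetical engine, constructed fresh for every request."""

    def __init__(self, settings, tools):
        self.settings = settings
        self.tools = tools  # maps tool name -> callable(query, context) -> str

    def run(self, query, needs, on_progress=lambda message: None):
        started = time.perf_counter()
        report = OrchestrationReport()
        context = {}
        # Optional tools run first so legal reasoning can see their results.
        order = [name for name in ("rag", "internet") if name in needs] + ["legal"]
        for name in order:
            on_progress(f"Running {name} tool...")
            context[name] = self.tools[name](query, dict(context))
            report.tools_used.append(name)
        report.processing_time_s = time.perf_counter() - started
        return context["legal"], report


def handle_request(query, needs, load_settings, tools, on_progress=lambda message: None):
    """Load current settings and build a new engine, so no state is shared across requests."""
    engine = V7Engine(load_settings(), tools)
    return engine.run(query, needs, on_progress)
```

Because `load_settings` is called on every request, a settings change is picked up by the very next query without a restart.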
- Admin Login: Authenticated admin access
- Settings Update: Configuration changes via admin panel
- Database Persistence: Settings saved to admin_settings table
- Cache Invalidation: Redis cache cleared for admin settings
- Immediate Effect: Next query uses new settings (no restart needed)
- Validation: Settings validated before application
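The update path above follows a write-then-invalidate pattern. Below is a minimal in-memory sketch: two dictionaries stand in for the admin_settings table and the Redis cache, and the class name, the per-tenant key, and the temperature range check are assumptions for illustration only.

```python
class SettingsStore:
    """In-memory stand-in for the admin_settings table plus a Redis cache."""

    def __init__(self):
        self._db = {}     # stand-in for the admin_settings table
        self._cache = {}  # stand-in for Redis

    def update(self, tenant_id, settings):
        # Validate before anything is persisted (the 0.0-2.0 range is an assumed rule).
        if not 0.0 <= settings.get("temperature", 0.0) <= 2.0:
            raise ValueError("temperature out of range")
        self._db[tenant_id] = dict(settings)
        self._cache.pop(tenant_id, None)  # invalidate so the next read sees new values

    def load(self, tenant_id):
        # Read through the cache; on a miss, fall back to the database copy.
        if tenant_id not in self._cache:
            self._cache[tenant_id] = dict(self._db.get(tenant_id, {}))
        return self._cache[tenant_id]
```

Invalidating on write, rather than waiting for a cache entry to expire, is what lets the next query use the new settings immediately.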
- JWT Tokens: Secure token-based authentication
- Tenant Isolation: Multi-tenant data separation
- Role-Based Access: Admin vs user permissions
- API Key Management: Secure storage of LLM provider keys
- Encryption: All sensitive data encrypted at rest
- HTTPS: All communications over secure channels
- Input Validation: Comprehensive request validation
- Rate Limiting: Protection against abuse
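The document does not say which rate-limiting algorithm is used. As one possible approach, the sketch below shows a per-key sliding-window limiter; the class name, the limits, and the idea of keying by tenant or user are assumptions, and a multi-instance deployment would need shared storage (for example Redis) rather than process memory.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per key within the last `window_s` seconds."""

    def __init__(self, max_requests, window_s, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_s = window_s
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```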
- Stateless Backend: API servers can be scaled horizontally
- Database Sharding: PostgreSQL can be sharded by tenant
- Cache Layer: Redis for improved performance
- CDN Integration: Static assets served via CDN
- Connection Pooling: Efficient database connections
- Async Processing: Non-blocking I/O operations
- Caching Strategy: Multi-level caching (Redis, in-memory)
- Query Optimization: Efficient database queries
- API Performance: Response times, error rates
- AI Usage: Token consumption, costs per provider
- Database Performance: Query times, connection usage
- User Analytics: Usage patterns, popular queries
- Structured Logging: JSON-formatted logs
- Error Tracking: Comprehensive error reporting
- Audit Trail: All admin actions logged
- Performance Monitoring: Slow query detection
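Structured JSON logging can be done with the standard `logging` module alone. The sketch below is one way to do it; the field names and the optional `tenant_id` and `duration_ms` extras are assumptions, not the project's actual log schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    OPTIONAL_FIELDS = ("tenant_id", "duration_ms")  # hypothetical extra fields

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy extras passed via `extra={...}` only when they are present.
        for key in self.OPTIONAL_FIELDS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A slow query could then be reported with `build_logger("mojai.db").warning("slow query", extra={"duration_ms": 1200})`, where the logger name is again only an example.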
- Local Development: Docker Compose setup
- Hot Reload: Automatic code reloading
- Test Databases: Isolated test data
- Mock Services: LLM provider mocking for testing
- Container Orchestration: Kubernetes or Docker Swarm
- Load Balancing: Multiple API server instances
- Database Clustering: PostgreSQL high availability
- Backup Strategy: Automated backups and recovery
- Performance: Async support and high performance
- Type Safety: Built-in Pydantic validation
- Documentation: Automatic OpenAPI documentation
- Modern Python: Latest Python features and async/await
- ACID Compliance: Strong consistency guarantees
- Complex Queries: Advanced SQL capabilities
- JSON Support: Native JSON columns for flexibility
- Mature Ecosystem: Extensive tooling and support
- SSR/SSG: Better SEO and initial load performance
- File-based Routing: Simplified routing structure
- API Routes: Backend functionality when needed
- Production Ready: Built-in optimizations
- Open Source: No vendor lock-in
- Self-hosted: Full control over data
- GraphQL API: Flexible query interface
- Multi-modal: Support for various data types