v1.5.0 Release Notes by sopaco · Pull Request #96 · sopaco/deepwiki-rs

sopaco · 2026-04-05T06:40:47Z

🔥 Major Fix: LLM Deserialization Reliability

This release resolves a long-standing issue that affected production usage across different LLM providers.

The Problem

Previously, Litho relied on LLMs to follow the JSON Schema strictly and return perfectly formatted structured JSON. On complex projects and across varied models (especially non-OpenAI providers such as Ollama and DeepSeek), this caused frequent parsing errors. The system was brittle: it failed outright whenever a model deviated from the expected format.

The Solution

We've implemented a comprehensive, multi-layered approach to ensure reliable structured data extraction:

1. Lenient Deserialization with Intelligent Fallbacks

Added robust fallback handling that gracefully manages malformed JSON responses
Multiple parsing strategies: strict schema → relaxed schema → text extraction
Detailed error context for debugging, recorded without interrupting the run
Automatic recovery from common LLM formatting mistakes
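As a rough illustration of the strict → relaxed → text extraction chain, here is a minimal std-only sketch. It is not Litho's actual implementation: the names `Parser`, `Strategy`, `parse_with_fallbacks`, and the three toy parsers are hypothetical, and a single numeric score stands in for a full JSON schema.

```rust
/// A parser turns raw model output into a value or an error message.
type Parser<T> = fn(&str) -> Result<T, String>;
/// A strategy is a label (for diagnostics) plus its parser.
type Strategy<T> = (&'static str, Parser<T>);

/// Try each strategy in order and return the first success together with
/// the label of the strategy that produced it. If every strategy fails,
/// return all error messages so the caller can log the full context.
fn parse_with_fallbacks<T>(
    raw: &str,
    strategies: &[Strategy<T>],
) -> Result<(T, &'static str), Vec<String>> {
    let mut errors = Vec::new();
    for (label, parse) in strategies {
        match parse(raw) {
            Ok(value) => return Ok((value, *label)),
            Err(e) => errors.push(format!("{label}: {e}")),
        }
    }
    Err(errors)
}

// Toy stand-in for strict schema parsing: the input must be exactly a number.
fn strict(raw: &str) -> Result<u32, String> {
    raw.parse::<u32>().map_err(|e| e.to_string())
}

// Toy stand-in for relaxed parsing: tolerate whitespace and quotes.
fn relaxed(raw: &str) -> Result<u32, String> {
    raw.trim().trim_matches('"').parse::<u32>().map_err(|e| e.to_string())
}

// Toy stand-in for text extraction: take the first run of ASCII digits.
fn text_extraction(raw: &str) -> Result<u32, String> {
    let digits: String = raw
        .chars()
        .skip_while(|c| !c.is_ascii_digit())
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse::<u32>().map_err(|_| "no digits found".to_string())
}

/// The three strategies in the order they are attempted.
fn score_strategies() -> [Strategy<u32>; 3] {
    [
        ("strict", strict as Parser<u32>),
        ("relaxed", relaxed as Parser<u32>),
        ("text", text_extraction as Parser<u32>),
    ]
}
```

Returning the label alongside the value lets the caller log which layer succeeded, which is one way to keep error context without aborting.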

2. Provider-Specific Extractors

OllamaExtractorWrapper: For Ollama (and similar text-only models), adds smart JSON extraction from markdown code blocks with retry logic
OpenAICompatibleExtractorWrapper: Enhanced support for OpenAI-compatible APIs with structured output where available
Unified interface that automatically selects the appropriate extraction strategy per provider
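The extraction step for text-only models might look like the sketch below. It assumes the reply wraps its JSON in a fenced markdown code block or, failing that, embeds one object somewhere in the prose. `extract_json` is a hypothetical helper, not the code inside `OllamaExtractorWrapper`, and it does not validate that the result is well-formed JSON.

```rust
/// Return the most likely JSON payload inside a free-form model reply.
/// Prefers the first fenced code block; otherwise falls back to the span
/// from the first '{' to the last '}'.
fn extract_json(reply: &str) -> Option<&str> {
    // Build the three-backtick fence marker at runtime.
    let fence = "`".repeat(3);
    if let Some(open) = reply.find(fence.as_str()) {
        let after_open = &reply[open + fence.len()..];
        // Skip an optional language tag such as "json" on the fence line.
        let body_start = after_open.find('\n').map(|i| i + 1).unwrap_or(0);
        let body = &after_open[body_start..];
        if let Some(close) = body.find(fence.as_str()) {
            return Some(body[..close].trim());
        }
    }
    // No complete fenced block: fall back to the outermost braces.
    let start = reply.find('{')?;
    let end = reply.rfind('}')?;
    if start < end {
        Some(reply[start..=end].trim())
    } else {
        None
    }
}
```

A caller would pass the returned slice to the real JSON deserializer and, if that fails, retry the request.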

3. Enhanced Prompt Engineering

Refined system prompts to emphasize JSON formatting requirements
Added specific instructions for edge cases (null values, optional fields, array formats)
Better guidance on nested object structures

4. Configurable Retry & Backoff

Exponential backoff with jitter for rate limits
Attempt counts per provider configurable via llm.retry_attempts
Detailed logging of failures and retry attempts
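For readers unfamiliar with the technique, exponential backoff with jitter can be sketched as follows. This uses the "full jitter" variant (a random fraction of the capped exponential delay), which is an assumption; the release notes do not say which variant Litho uses. `backoff_delay` and `retry` are hypothetical names, and the jitter value is supplied by the caller so the sketch needs no random-number crate.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): a fraction `jitter`
/// of min(cap, base * 2^attempt). `jitter` is clamped to [0.0, 1.0] and
/// would normally come from a random number generator.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration, jitter: f64) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // On overflow, fall back to the cap.
    let ceiling = base.checked_mul(factor).unwrap_or(cap).min(cap);
    let nanos = ceiling.as_nanos() as f64 * jitter.clamp(0.0, 1.0);
    Duration::from_nanos(nanos as u64)
}

/// Run `op` up to `max_attempts` times, sleeping between failed attempts.
/// `op` receives the 0-based attempt number; the last error is returned
/// if every attempt fails.
fn retry<T, E>(
    max_attempts: u32,
    base: Duration,
    cap: Duration,
    mut jitter: impl FnMut() -> f64,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(e) if attempt + 1 >= max_attempts => return Err(e),
            Err(_) => {
                std::thread::sleep(backoff_delay(attempt, base, cap, jitter()));
                attempt += 1;
            }
        }
    }
}
```

In this sketch `max_attempts` plays the role that `llm.retry_attempts` plays in the configuration; how Litho maps the setting internally is not stated here.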

Impact: Users should now see significantly fewer failures when using:

Local models (Ollama, Llama, etc.)
Alternative cloud providers (DeepSeek, Moonshot, Mistral, OpenRouter)
Complex schemas with nested structures
Large codebases requiring extensive analysis

🗄️ Database Documentation Generation (New Feature)

Introduced DatabaseOverviewAnalyzer agent
Analyzes SQL database projects: tables, views, stored procedures, relationships
Generates comprehensive database documentation with schema relationships
Supports filtering by knowledge categories for context-aware analysis
Especially useful for SQL Server and data warehouse projects

📄 External Knowledge Integration (New Feature)

Added powerful local_docs integration for importing external documentation:

Supported File Types: PDF, Markdown, Text, SQL, YAML, JSON

Key Capabilities:

Category-based document organization (e.g., "architecture", "api", "database")
Targeted delivery: documents can be assigned to specific agents
Configurable chunking: semantic, fixed-size, or paragraph-based
Overlap control for context preservation
Real-time file watching for live updates
Smart caching to avoid re-processing unchanged files

Example Configuration:

[knowledge.local_docs]
enabled = true
categories = [
  { name = "architecture", paths = ["docs/architecture/*.md"], target_agents = ["ResearchAgent"] },
  { name = "database", paths = ["db/**/*.sql"], chunking = { max_chunk_size = 8000 } }
]
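To make the chunking and overlap options concrete, here is a minimal sketch of fixed-size chunking with overlap. It assumes sizes are counted in characters; the unit of `max_chunk_size` is not specified above, and `chunk_with_overlap` is a hypothetical function rather than Litho's chunker.

```rust
/// Split `text` into chunks of at most `max_chars` characters. Each chunk
/// after the first starts `overlap` characters before the previous chunk
/// ended, so neighbouring chunks share some context. Operates on chars so
/// multi-byte UTF-8 sequences are never split.
fn chunk_with_overlap(text: &str, max_chars: usize, overlap: usize) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    assert!(overlap < max_chars, "overlap must be smaller than max_chars");
    let chars: Vec<char> = text.chars().collect();
    let step = max_chars - overlap;
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + max_chars).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // the final chunk reached the end of the text
        }
        start += step;
    }
    chunks
}
```

With a chunk size of 4 and an overlap of 1, the text `abcdefghij` yields `abcd`, `defg`, and `ghij`: each chunk repeats the last character of the one before it.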

🤖 LLM Provider Enhancements

OpenAI-compatible provider support: Works with any OpenAI-compatible API endpoint
Updated default models: More reliable model selections per provider
Newly tested providers: integration improvements for Gemini, Anthropic, and Mistral
Better error recovery: Automatic failover to alternative models via llm.fallover_model config
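The model fallback described in the last item can be pictured with the sketch below. It assumes a single alternative model that is tried once after the primary fails; `complete_with_failover` and the `complete` callback are hypothetical and do not reflect Litho's internal API.

```rust
/// Call `complete` with the primary model. If that fails and an
/// alternative model is configured, try once more with the alternative;
/// otherwise return the primary model's error.
fn complete_with_failover<E>(
    primary: &str,
    fallback: Option<&str>,
    mut complete: impl FnMut(&str) -> Result<String, E>,
) -> Result<String, E> {
    match complete(primary) {
        Ok(text) => Ok(text),
        Err(primary_err) => match fallback {
            Some(model) => complete(model),
            None => Err(primary_err),
        },
    }
}
```

Passing `None` for the fallback reproduces the behaviour of a configuration with no alternative model set.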

📊 Statistics

103 files changed
22,455 insertions(+), 15,698 deletions(-)
500+ lines of new documentation
15+ new modules and types

This is a major stability release that significantly improves reliability across all LLM providers.

Full Changelog: 1.2.6...1.5.0

Migration Notes

No breaking changes for existing users: all changes are backward compatible
New features are opt-in via configuration:
- knowledge.local_docs section for external documentation
- boundary_analysis section for performance tuning
BoundaryAnalyzer now outputs structured JSON (internal change, no user action needed)
The old documentation format (__Litho_Summary_*) is deprecated but still supported

Special thanks to contributors who helped test and improve LLM compatibility across diverse setups!

Migrated from ReAct multi-turn pattern to single-turn prompting after rig-core 0.34 removed the multi_turn method. Removed target language dependency and partial result handling. Added lenient deserializers for numeric fields to handle varied LLM output formats.
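A lenient numeric conversion of the kind mentioned in this commit could look like the following sketch. It is std-only and not wired into any deserialization framework; `lenient_u64` is a hypothetical helper, and the set of accepted forms (bare integer, quoted integer, float with a zero fractional part) is an assumption about what "varied LLM output formats" covers.

```rust
/// Accept a numeric field the way models tend to emit it: a bare integer
/// ("42"), a quoted integer ("\"42\""), or a float with a zero fractional
/// part ("42.0"). Returns None for anything that is not a non-negative
/// whole number.
fn lenient_u64(raw: &str) -> Option<u64> {
    // Strip whitespace and surrounding double quotes.
    let cleaned = raw.trim().trim_matches('"').trim();
    if let Ok(n) = cleaned.parse::<u64>() {
        return Some(n);
    }
    // Fall back to a float parse and accept only whole, in-range values.
    match cleaned.parse::<f64>() {
        Ok(f) if f.is_finite() && f >= 0.0 && f.fract() == 0.0 && f <= u64::MAX as f64 => {
            Some(f as u64)
        }
        _ => None,
    }
}
```

Rejecting values such as `4.5` instead of rounding them keeps the conversion lenient about format but strict about meaning.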

Improve LLM extraction reliability

Add OpenAI-compatible extractor with HTTP fallback for problematic APIs. Simplify CodeInsightLLMOutput schema to essential fields and enforce strict JSON output format. Switch BoundaryAnalyzer and WorkflowResearcher from structured outputs to String. Update reqwest dependency with json feature.

sopaco added 4 commits March 19, 2026 06:24

bump version to 1.3.5

50947e0

Switch BoundaryAnalyzer to structured JSON output

6cda939

sopaco merged commit 4d0cb5c into main Apr 5, 2026
1 check passed

sopaco merged 4 commits into main from dev