
strickvl

Summary

Introduces the ZenML Deep Research Agent, a production-ready MLOps pipeline that uses LLMs and web search to conduct in-depth research on any topic.

Key Features

  • Structured Research: Creates outlines and researches each section through targeted web searches
  • Parallel Processing: Concurrent sub-question processing for faster results
  • Iterative Refinement: Reflection cycles to improve content quality
  • Human-in-the-Loop: Optional approval mechanism for search queries
  • Comprehensive Tracking: Cost tracking, tracing metadata, and LLM observability via Langfuse
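
As a rough illustration of the parallel processing feature, the sketch below fans sub-questions out to a thread pool. This description does not show how the pipeline actually implements concurrency, so treat this as one possible approach: `research_sub_question` and `research_in_parallel` are hypothetical names, and the worker body is a placeholder for a real search-plus-LLM call.

```python
from concurrent.futures import ThreadPoolExecutor


def research_sub_question(sub_question: str) -> str:
    """Hypothetical stand-in for one search-plus-LLM call."""
    return f"findings for: {sub_question}"


def research_in_parallel(
    sub_questions: list[str], max_workers: int = 4
) -> dict[str, str]:
    """Research every sub-question concurrently and map each to its result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        results = list(pool.map(research_sub_question, sub_questions))
    return dict(zip(sub_questions, results))


if __name__ == "__main__":
    print(research_in_parallel(["What is MLOps?", "What is ZenML?"]))
```

Threads suit this kind of workload because each sub-question mostly waits on network I/O rather than CPU.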

Architecture Highlights

  • Query decomposition into specific sub-questions
  • Cross-viewpoint analysis for balanced perspectives
  • Multiple research configurations (quick, balanced, thorough, enhanced)
  • Pydantic-based data models for structured outputs
  • Static HTML report generation with interactive visualizations
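
To make the query decomposition idea concrete, here is a minimal sketch of a structured research plan. It uses standard-library dataclasses as a stand-in for the Pydantic models mentioned above so that it runs without third-party dependencies. All class, field, and function names (`SubQuestion`, `ResearchPlan`, `build_plan`) are assumptions for illustration and are not taken from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class SubQuestion:
    """One specific question derived from the main query (hypothetical model)."""
    text: str
    findings: list[str] = field(default_factory=list)


@dataclass
class ResearchPlan:
    """The main query together with its decomposed sub-questions."""
    main_query: str
    sub_questions: list[SubQuestion] = field(default_factory=list)

    def unanswered(self) -> list[SubQuestion]:
        """Return the sub-questions that have no findings yet."""
        return [q for q in self.sub_questions if not q.findings]


def build_plan(main_query: str, sub_question_texts: list[str]) -> ResearchPlan:
    """Wrap raw sub-question strings (e.g. from an LLM) in a structured plan."""
    return ResearchPlan(
        main_query=main_query,
        sub_questions=[SubQuestion(text=t) for t in sub_question_texts],
    )


if __name__ == "__main__":
    plan = build_plan("Impact of MLOps", ["What is MLOps?", "Who adopts it?"])
    plan.sub_questions[0].findings.append("A set of practices for ML in production.")
    print([q.text for q in plan.unanswered()])
```

A real Pydantic model would additionally validate field types when the LLM output is parsed, which dataclasses do not do.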

Integrations

  • LLM: Flexible provider support via litellm
  • Search: Tavily API (default) and Exa API support
  • Observability: Langfuse integration for LLM tracking
  • Orchestration: Full ZenML pipeline with caching and reproducibility

🤖 Generated with Claude Code

@strickvl added the enhancement and internal labels on May 27, 2025
@strickvl requested review from htahir1 and Copilot on May 27, 2025 at 17:30

@Copilot Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@strickvl requested a review from Copilot on May 27, 2025 at 17:37

@Copilot Copilot AI left a comment


Pull Request Overview

This PR adds the Deep Research Agent pipeline along with extensive supporting tests, documentation, and new configuration files for various research modes. Key changes include:

  • Implementation and testing of cost tracking for Exa search queries
  • Detailed design documents covering the Pydantic migration, prompt cost visualization, and configuration of the research pipelines
  • New scripts for LiteLLM testing and budget handling in pipeline steps

Reviewed Changes

Copilot reviewed 65 out of 65 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| deep_research/design/test_exa_cost_tracking.py | New test cases for Exa cost tracking |
| deep_research/design/sample_observation.md | Sample observation document for research outputs |
| deep_research/design/pydantic_migration.md | Documentation for migrating dataclasses to Pydantic models |
| deep_research/design/prompt_cost_visualization.md | Design document for prompt-based cost visualization |
| deep_research/design/lite_test.py | Test script for validating LiteLLM responses |
| deep_research/design/exa_cost_tracking_summary.md | Summary of Exa cost tracking implementation |
| deep_research/design/exa_cost_tracking_fixes.md | Documentation of fixes applied to Exa cost tracking formatting issues |
| deep_research/design/budget_test_pipeline.py | Pipeline test for setting budget and testing LLM functionality |
| deep_research/configs/* | Multiple new configuration files for different research scenarios |

Comments suppressed due to low confidence (1)

deep_research/design/budget_test_pipeline.py:25

  • Remove the leftover debugging 'breakpoint()' statement to avoid unintended interruptions in production execution.
breakpoint()


# Configuration
LITELLM_BASE_URL = "https://litellm-service-5ikaahlouq-uc.a.run.app"
API_KEY = "zenmllitellm"

Copilot AI May 27, 2025


Avoid hardcoding API keys in the source code; instead, load them from environment variables or secure configuration files.
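
One way to follow this suggestion is to read the settings from the environment and fail fast when the key is missing. The sketch below is illustrative only: the variable names `LITELLM_API_KEY` and `LITELLM_BASE_URL` and the helper `load_litellm_settings` are assumptions, not names used in the PR.

```python
import os
from collections.abc import Mapping
from typing import Optional


def load_litellm_settings(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Read LiteLLM connection settings from environment variables.

    The variable names are hypothetical; adapt them to the project's conventions.
    """
    env = os.environ if env is None else env
    api_key = env.get("LITELLM_API_KEY")
    if not api_key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError("LITELLM_API_KEY is not set")
    base_url = env.get("LITELLM_BASE_URL")
    if not base_url:
        raise RuntimeError("LITELLM_BASE_URL is not set")
    return {"base_url": base_url, "api_key": api_key}


if __name__ == "__main__":
    try:
        settings = load_litellm_settings()
        print("Loaded settings for", settings["base_url"])
    except RuntimeError as exc:
        print("Configuration error:", exc)
```

Passing the environment as an optional mapping keeps the helper easy to test without touching the real process environment.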


@strickvl force-pushed the feature/deep-research branch from 2842ae7 to fcacbf5 on May 27, 2025 at 17:50
@strickvl closed this on May 27, 2025