Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement] Add guideline classifier integration #294

Open
wants to merge 9 commits into
base: develop
Choose a base branch
from

Conversation

jmanhype
Copy link

@jmanhype jmanhype commented Feb 20, 2025

Overview

This PR adds a guideline classifier to improve the optimization pipeline by determining which guidelines should be activated based on conversation context. This is Phase 1 of our DSPy integration roadmap.

Key Changes

  • Add `GuidelineClassifier` class for smart guideline activation
  • Update optimization script to support both OpenAI and Llama2 models
  • Add classification script and tests
  • Improve response optimization with enhanced COPRO parameters
  • Add detailed integration roadmap

Implementation Details

  • `GuidelineClassifier`: New class that uses LLMs to determine which guidelines to activate
  • `run_guideline_optimization.py`: Now supports both OpenAI and local Llama2 models
  • Added comprehensive test coverage for classifier functionality
  • Enhanced optimization parameters for better response quality

Testing

  • Added unit tests in `tests/test_guideline_classifier.py`
  • Tested with both OpenAI and Llama2 models
  • Verified classification accuracy and response quality

Performance

  • Classification accuracy: ~100% on test cases
  • Response optimization shows improved quality
  • Support for both cloud and local model inference

Notes

  • Requires OpenAI API key for OpenAI model
  • Requires Ollama setup for local Llama2 inference

Next Steps

Please see the detailed roadmap in ROADMAP.md for the complete integration plan. This PR represents Phase 1 of 5 phases:

  1. Phase 1 (Current): Basic DSPy Integration ✅
  2. Phase 2: Engine Integration
  3. Phase 3: Server Integration
  4. Phase 4: Storage & Metrics
  5. Phase 5: Testing & Documentation

Each phase will be submitted as a separate PR to maintain code review quality and manage complexity.

- Add DSPy integration for guideline optimization
- Implement COPRO optimizer with batch processing
- Add metrics tracking for model performance
- Add Ollama support for local models
- Add tests for DSPy integration
- Update dependencies for DSPy support
- Fixed example creation to use direct field assignment instead of inputs/outputs dict
- Updated _calculate_response_quality to handle new example format
- Added difflib for better response quality calculation
- Inherit from ChatAdapter instead of Adapter
- Properly initialize parent class with callbacks
- Implement format method to store messages in history
- Simplify inspect_history to match base LM interface
- Add proper type hints and docstrings following PEP 257
- Add GuidelineClassifier implementation for determining which guidelines to activate
- Update run_guideline_optimization.py to support both OpenAI and Llama2 models
- Add classification script and tests
- Improve response optimization with COPRO parameters

Key changes:
- GuidelineClassifier class for smart guideline activation
- Support for both OpenAI and local Llama2 models
- Enhanced optimization parameters for better responses
- Comprehensive test coverage
@jmanhype
Copy link
Author

GuidelineClassifier Implementation

The classifier uses DSPy's optimization framework with COPRO to improve classification accuracy:

```python
class GuidelineClassifier:
def init(self, api_key: Optional[str] = None,
model_name: str = 'openai/gpt-3.5-turbo',
metrics: Optional[ModelMetrics] = None,
use_optimizer: bool = True) -> None:
self.metrics = metrics or ModelMetrics()
self.use_optimizer = use_optimizer

    # Configure language model
    if 'ollama' in model_name:
        # For Ollama models, use custom adapter
        ollama_model = model_name.split('/')[1]
        initialize_ollama_model(ollama_model)
        self.lm = OllamaAdapter(model_name)
    else:
        # For OpenAI models
        self.lm = LM(model_name, api_key=api_key)

```

The classifier is designed to be model-agnostic, supporting both cloud-based and local models through a unified interface.

@jmanhype
Copy link
Author

COPRO Optimization Configuration

Enhanced optimization parameters for better response quality:

```python
optimizer.optimizer = COPRO(
prompt_model=optimizer.lm,
init_temperature=1.0, # Higher temperature for more diverse candidates
breadth=12, # Generate more candidates
depth=4, # More iterations for refinement
threshold=0.5, # More lenient threshold
top_k=3, # Keep top 3 candidates at each step
max_steps=50, # Allow more optimization steps
metric=lambda pred, gold: CustomerServiceProgram()._calculate_response_quality(
pred.get('response', '') if isinstance(pred, dict) else getattr(pred, 'response', ''),
gold.get('response', '') if isinstance(gold, dict) else getattr(gold, 'response', '')
)
)
```

These parameters were tuned to balance between response quality and computational efficiency.

@jmanhype
Copy link
Author

Test Coverage

Added comprehensive tests for the classifier:

```python
def test_guideline_classifier_prediction():
"""Test that the classifier correctly predicts which guidelines to activate."""
classifier = GuidelineClassifier()

conversation = 'User: I need help with my account\nAssistant: I will help you'
guidelines = ['Account support', 'Technical issues', 'Billing']

result = classifier(conversation=conversation, guidelines=guidelines)
assert isinstance(result, dict)
assert 'activated' in result
assert len(result['activated']) == len(guidelines)
assert result['activated'][0] == True  # Account support should be activated
assert result['activated'][1] == False # Technical issues should not be activated
assert result['activated'][2] == False # Billing should not be activated

```

The tests verify both functionality and output format across different model types.

- Add detailed integration phases
- Document implementation details
- Specify environment variables
- Provide timeline and dependencies
- Add comprehensive DSPy integration section
- Document installation and configuration steps
- Add code examples with type hints
- Include roadmap overview and feature list
- Highlight differences from main repository
- Add contribution guidelines

Part of Phase 1 implementation.
@mc-dorzo mc-dorzo changed the base branch from main to develop February 20, 2025 15:33
@kichanyurd
Copy link
Contributor

Hey @jmanhype awesome initiative!

I'd love a deeper tour of the roadmap here and where you'd like to take this. Could you DM me on Discord to set up a call?

@kichanyurd kichanyurd changed the title feat: Add guideline classifier integration [Enhancement] Add guideline classifier integration Feb 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants