stride-research/impress-agentic

IMPRESS-LLM

Adaptive protein design through LLM-guided optimization with FlowCademy orchestration

Python 3.8+ License: MIT

Overview

IMPRESS-LLM is a protein design framework that uses Large Language Models (LLMs) to guide protein sequence optimization. Built on FlowCademy's agent-based architecture, it coordinates protein engineering tools for sequence optimization, structure prediction, and scoring as a modular system of autonomous agents.

Key Innovation

Traditional protein design workflows use fixed optimization strategies. IMPRESS-LLM makes the strategy adaptive:

  • Each component (sequence optimization, structure prediction, scoring) is an independent Academy agent
  • LLM agents analyze optimization trajectories in real time and adjust strategies dynamically
  • Agents communicate through FlowCademy's type-safe remote method invocation
  • AsyncFlow workflows enable parallel processing of multiple sequences
  • The entire system scales from local execution to HPC environments
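The adaptive loop described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: `choose_strategy` is a hypothetical rule-based stand-in for the LLM agent, and `toy_score` is a made-up scoring function standing in for structure prediction and scoring.

```python
import random
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(sequence: str) -> float:
    """Hypothetical score: fraction of alanine residues (stand-in for real scoring)."""
    return sequence.count("A") / len(sequence)


def mutate(sequence: str, mutation_rate: float, rng: random.Random) -> str:
    """Substitute a number of random positions proportional to mutation_rate."""
    residues = list(sequence)
    n_mutations = max(1, round(mutation_rate * len(residues)))
    for position in rng.sample(range(len(residues)), n_mutations):
        residues[position] = rng.choice(AMINO_ACIDS)
    return "".join(residues)


def choose_strategy(history: List[float], mutation_rate: float) -> Dict[str, float]:
    """Stand-in for the LLM agent: raise the rate when progress stalls, else lower it."""
    stalled = len(history) >= 3 and history[-1] <= history[-3]
    new_rate = mutation_rate * 1.5 if stalled else mutation_rate * 0.9
    return {"mutation_rate": min(max(new_rate, 0.01), 0.5)}


def optimize(sequence: str, iterations: int = 10, seed: int = 0) -> Dict[str, object]:
    rng = random.Random(seed)
    best, best_score = sequence, toy_score(sequence)
    mutation_rate, history = 0.1, [best_score]
    for _ in range(iterations):
        candidate = mutate(best, mutation_rate, rng)
        score = toy_score(candidate)
        if score > best_score:  # keep only improvements
            best, best_score = candidate, score
        history.append(best_score)
        # The strategy is revisited every iteration instead of being fixed up front
        mutation_rate = choose_strategy(history, mutation_rate)["mutation_rate"]
    return {"sequence": best, "score": best_score, "history": history}


result = optimize("MKTLLVAGGSPEELK")
```

A fixed-strategy workflow would keep `mutation_rate` constant. Here the trajectory in `history` feeds back into the next iteration's parameters, which is the role the LLM agent plays in IMPRESS-LLM.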

Architecture

FlowCademy-Based Design

IMPRESS-LLM is built on three foundational technologies:

  1. Academy: Provides the agent framework with autonomous actors
  2. AsyncFlow: Handles workflow orchestration and task dependencies
  3. FlowCademy: Integrates Academy agents with AsyncFlow workflows
┌─────────────────────────────────────────────────────────┐
│                FlowCademy Integration Layer             │
│         Bridges Academy agents with AsyncFlow           │
└─────────────────────┬───────────────────────────────────┘
                      │
     ┌────────────────┼────────────────┐
     │                │                │
┌────▼─────────┐ ┌────▼─────────┐ ┌────▼─────────────┐
│   Academy    │ │   AsyncFlow  │ │ Agent Manager    │
│   Agents     │ │   Workflows  │ │ & Communication  │
└──────┬───────┘ └─────┬────────┘ └────────┬─────────┘
       │               │                   │
   ┌───▼───┐      ┌────▼────┐         ┌────▼────┐
   │Protein│      │Workflow │         │Exchange │
   │Agents │      │Tasks    │         │Clients  │
   └───────┘      └─────────┘         └─────────┘

Agent Architecture

# Each component is an autonomous Academy agent
from academy.agent import Agent, action


class SequenceOptimizerAgent(Agent):
    """Handles mutation strategies and sequence optimization"""

    @action
    async def apply_mutations(self, sequence: str,
                              mutation_rate: float) -> str:
        # Intelligent mutation logic (body omitted in this overview)
        ...

    @action
    async def crossover_sequences(self, seq1: str, seq2: str) -> str:
        # Genetic crossover operations (body omitted in this overview)
        ...


class ReportGeneratorAgent(Agent):
    """Manages result collection and visualization"""

    @action
    async def generate_report(self) -> str:
        # Creates comprehensive analysis reports (body omitted in this overview)
        ...
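The two optimizer actions above leave the mutation and crossover logic unspecified. One common way to fill them in, shown here as plain functions rather than Academy actions, is per-residue random substitution and single-point crossover. The alphabet constant, the extra `rng` parameter, and both function bodies are illustrative assumptions, not the project's code.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def apply_mutations(sequence: str, mutation_rate: float, rng: random.Random) -> str:
    """Substitute each residue independently with probability mutation_rate."""
    if not 0.0 <= mutation_rate <= 1.0:
        raise ValueError("mutation_rate must be between 0 and 1")
    mutated = []
    for residue in sequence:
        if rng.random() < mutation_rate:
            # Draw a replacement that differs from the current residue
            mutated.append(rng.choice(AMINO_ACIDS.replace(residue, "")))
        else:
            mutated.append(residue)
    return "".join(mutated)


def crossover_sequences(seq1: str, seq2: str, rng: random.Random) -> str:
    """Single-point crossover: a prefix of seq1 joined to the matching suffix of seq2."""
    if len(seq1) != len(seq2) or len(seq1) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    point = rng.randrange(1, len(seq1))  # cut strictly inside the sequence
    return seq1[:point] + seq2[point:]


rng = random.Random(42)
child = crossover_sequences("AAAAAAAA", "CCCCCCCC", rng)
mutant = apply_mutations("MKTLLVAG", 0.25, rng)
```

Passing the random generator explicitly keeps runs reproducible, which helps when comparing strategies across iterations.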

Installation

# Clone repository
git clone https://github.com/impress-llm/impress-llm.git
cd impress-llm

# Install dependencies including FlowCademy
pip install -r requirements.txt

# Configure environment
cp .env.template .env
# Edit .env and add your OpenRouter API key

Quick Start

1. Basic Workflow with FlowCademy

# Test with mock tools (no GPU required)
export MOCK_MODE=true
cd src
python main.py
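How `main.py` interprets `MOCK_MODE` is not shown in this README. A common pattern, sketched below with hypothetical function names, is to parse the variable once and select a mock or real predictor accordingly. Treating an unset variable as "not mock" is an assumption of this sketch.

```python
import os
from typing import Callable, Dict, Mapping


def is_mock_mode(env: Mapping[str, str] = os.environ) -> bool:
    """Treat 'true', '1', and 'yes' (any case) as enabling mock mode."""
    return env.get("MOCK_MODE", "false").strip().lower() in {"true", "1", "yes"}


def mock_predict_structure(sequence: str) -> Dict[str, object]:
    """Hypothetical mock: returns a fixed confidence without running any model."""
    return {"sequence": sequence, "confidence": 0.5, "mock": True}


def real_predict_structure(sequence: str) -> Dict[str, object]:
    """Placeholder for a real, GPU-backed predictor."""
    raise NotImplementedError("real structure prediction is not part of this sketch")


def select_predictor(
    env: Mapping[str, str] = os.environ,
) -> Callable[[str], Dict[str, object]]:
    """Pick the predictor once at startup so the rest of the code is mode-agnostic."""
    return mock_predict_structure if is_mock_mode(env) else real_predict_structure


predictor = select_predictor({"MOCK_MODE": "true"})
prediction = predictor("MKTLLVAG")
```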

How FlowCademy Primitives Are Used

1. Agent Task Creation

FlowCademy's create_agent_task bridges Academy agents with AsyncFlow:

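from flowcademy import AcademyWorkflowIntegration
from radical.asyncflow import ThreadExecutionBackend, WorkflowEngine

# AlphaFoldAgent and ScoringAgent are the project's agent classes;
# sequence_name identifies the sequence being optimized.

# Initialize workflow engine
backend = ThreadExecutionBackend({})
flow = WorkflowEngine(backend=backend)

async with AcademyWorkflowIntegration(flow) as integration:
    # Create agent tasks
    af_task = integration.create_agent_task(
        AlphaFoldAgent, 'predict_structure',
        agent_id=f'af_{sequence_name}'
    )

    scorer_task = integration.create_agent_task(
        ScoringAgent, 'predict_affinity',
        agent_id=f'scorer_{sequence_name}'
    )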

2. Workflow Composition

AsyncFlow's function tasks enable complex workflows:

@flow.function_task
async def predict_structure():
    return await af_task(current_sequence)

@flow.function_task
async def score_structure(struct_result):
    return await scorer_task(struct_result)

# Execute workflow with implicit dependency tracking
struct_result = await predict_structure()
score_result = await score_structure(struct_result)
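The same predict-then-score chain can be pictured with plain `asyncio`, which also shows how several sequences run concurrently. This sketch does not use AsyncFlow or Academy; `predict_structure` and `score_structure` are stand-in coroutines with made-up return values.

```python
import asyncio
from typing import Dict, List


async def predict_structure(sequence: str) -> Dict[str, object]:
    """Stand-in for the structure prediction agent."""
    await asyncio.sleep(0)  # yield control, as a real remote call would
    return {"sequence": sequence, "length": len(sequence)}


async def score_structure(struct_result: Dict[str, object]) -> float:
    """Stand-in scorer: a made-up score derived from sequence length."""
    await asyncio.sleep(0)
    return round(1.0 / (1.0 + struct_result["length"]), 3)


async def run_pipeline(sequence: str) -> float:
    # Scoring depends on prediction, so the two steps run in order
    struct_result = await predict_structure(sequence)
    return await score_structure(struct_result)


async def run_all(sequences: List[str]) -> List[float]:
    # Independent sequences have no mutual dependencies and can run concurrently
    return await asyncio.gather(*(run_pipeline(s) for s in sequences))


scores = asyncio.run(run_all(["MKT", "MKTLLVAGG"]))
```

Within one sequence the data dependency forces ordering; across sequences nothing is shared, so `asyncio.gather` can interleave them.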

3. Agent Communication

Academy's Handle protocol enables remote agent invocation:

# Agents can invoke actions on other agents
optimizer_task = integration.create_agent_task(
    SequenceOptimizerAgent, 'apply_mutations'
)

# Invoke the action with an updated mutation rate;
# the arguments follow apply_mutations(sequence, mutation_rate)
mutated_sequence = await optimizer_task(
    current_sequence,
    mutation_rate=0.2
)

4. State Management

Each agent maintains independent state:

from typing import Dict, List

from academy.agent import Agent


class SequenceOptimizerAgent(Agent):
    mutation_rate: float
    temperature: float
    optimization_history: Dict[str, List]

    async def on_setup(self) -> None:
        """Initialize agent state"""
        self.mutation_rate = 0.1
        self.temperature = 0.5
        self.optimization_history = {}

Adaptive Strategies

The LLM Strategy Agent analyzes optimization progress and adapts the run in the following ways:

1. Dynamic Mutation Control

{
  "action": "continue",
  "parameters": {
    "mutation_rate": 0.15,
    "temperature": 0.3,
    "focus_regions": [[45, 67], [102, 115]]
  },
  "reason": "Focusing mutations on flexible loop regions"
}
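A decision like the one above is easiest to apply safely if it is parsed and validated first. The sketch below assumes the JSON shape shown here; the `StrategyDecision` class, the `parse_decision` helper, and the specific range checks are hypothetical and not taken from the project.

```python
import json
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class StrategyDecision:
    action: str
    mutation_rate: float
    temperature: float
    focus_regions: List[Tuple[int, int]] = field(default_factory=list)
    reason: str = ""


def parse_decision(raw: str) -> StrategyDecision:
    """Parse and validate an LLM decision before applying it to the optimizer."""
    data = json.loads(raw)
    params = data.get("parameters", {})
    mutation_rate = float(params["mutation_rate"])
    if not 0.0 <= mutation_rate <= 1.0:  # assumed valid range
        raise ValueError(f"mutation_rate out of range: {mutation_rate}")
    regions = [(int(start), int(end)) for start, end in params.get("focus_regions", [])]
    if any(start > end for start, end in regions):
        raise ValueError("focus region start must not exceed its end")
    return StrategyDecision(
        action=str(data["action"]),
        mutation_rate=mutation_rate,
        temperature=float(params["temperature"]),
        focus_regions=regions,
        reason=str(data.get("reason", "")),
    )


raw_decision = """
{
  "action": "continue",
  "parameters": {
    "mutation_rate": 0.15,
    "temperature": 0.3,
    "focus_regions": [[45, 67], [102, 115]]
  },
  "reason": "Focusing mutations on flexible loop regions"
}
"""
decision = parse_decision(raw_decision)
```

Rejecting malformed or out-of-range output at this boundary keeps a bad LLM response from silently corrupting optimizer state.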

2. Cross-Sequence Learning

The system leverages FlowCademy's agent communication to share insights across parallel optimizations, enabling knowledge transfer between sequences.
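The README does not describe the sharing mechanism, so the following is only a conceptual sketch. It replaces agent messaging with a hypothetical in-process `InsightBoard` and uses a made-up scoring rule, to show how one optimization can reuse parameters that worked well for another.

```python
import asyncio
from typing import Dict, List, Optional, Tuple


class InsightBoard:
    """Hypothetical shared store; agent messaging would play this role in the real system."""

    def __init__(self) -> None:
        # sequence name -> (score, mutation_rate)
        self._records: Dict[str, Tuple[float, float]] = {}

    def publish(self, name: str, score: float, mutation_rate: float) -> None:
        self._records[name] = (score, mutation_rate)

    def best_mutation_rate(self) -> Optional[float]:
        """Return the mutation rate of the highest-scoring record, if any."""
        if not self._records:
            return None
        return max(self._records.values())[1]


async def optimize_one(name: str, start_rate: float, board: InsightBoard) -> float:
    """Toy optimization: adopt the best shared rate, then report a made-up score."""
    shared_rate = board.best_mutation_rate()
    rate = shared_rate if shared_rate is not None else start_rate
    await asyncio.sleep(0)  # stand-in for the actual optimization work
    score = round(1.0 - abs(rate - 0.2), 3)  # made-up score peaking at rate 0.2
    board.publish(name, score, rate)
    return rate


async def main() -> List[float]:
    board = InsightBoard()
    board.publish("Target_A", 0.9, 0.2)  # an earlier run shares its result
    return await asyncio.gather(
        optimize_one("Target_B", 0.05, board),
        optimize_one("Target_C", 0.4, board),
    )


rates = asyncio.run(main())
```

Both later runs start from the rate that worked for `Target_A` instead of their own defaults. Tasks in a single event loop run on one thread, so this toy store needs no lock; a distributed deployment would need real message passing.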

3. Scalable Execution

FlowCademy supports multiple execution backends:

  • ThreadExecutionBackend: Local multi-threaded execution
  • DaskExecutionBackend: Distributed computing clusters
  • RadicalExecutionBackend: HPC environments with SLURM/PBS

Example Results

OPTIMIZATION PROGRESS (FlowCademy Orchestrated)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Sequence: Target_A
Agent ID: optimizer_Target_A
Initial Score: 0.623 → Final Score: 0.891
Iterations: 7
Active Agents: 4 (AlphaFold, Scorer, LLM, Optimizer)

Agent Communication Log:
- optimizer → llm: Request strategy (iteration 3)
- llm → optimizer: Increase mutation rate
- scorer → reporter: New best score 0.891
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Configuration

Environment Variables

# LLM Configuration
OPENROUTER_API_KEY=your-key-here
LLM_MODEL=moonshotai/kimi-k2
LLM_API_BASE=https://openrouter.ai/api/v1

# Workflow Settings
MAX_ITERATIONS=10
FASTA_PATTERN=proteins/*.fasta
MOCK_MODE=true  # Set false for real tools

# FlowCademy Settings
FLOWCADEMY_BACKEND=thread  # thread, dask, or radical
FLOWCADEMY_MAX_AGENTS=10
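These variables can be collected into one typed settings object at startup. The sketch below is an assumption about how such loading might look, not the project's loader: `Settings`, `load_settings`, and the fallback defaults are hypothetical. It strips trailing inline comments because the template above places one after `MOCK_MODE`.

```python
import os
from dataclasses import dataclass
from typing import Mapping

VALID_BACKENDS = ("thread", "dask", "radical")


@dataclass(frozen=True)
class Settings:
    llm_model: str
    max_iterations: int
    mock_mode: bool
    backend: str
    max_agents: int


def _get(env: Mapping[str, str], key: str, default: str) -> str:
    """Read a value and drop any trailing inline comment such as 'true  # note'."""
    return env.get(key, default).split("#", 1)[0].strip()


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    backend = _get(env, "FLOWCADEMY_BACKEND", "thread").lower()
    if backend not in VALID_BACKENDS:
        raise ValueError(
            f"FLOWCADEMY_BACKEND must be one of {VALID_BACKENDS}, got {backend!r}"
        )
    return Settings(
        llm_model=_get(env, "LLM_MODEL", "moonshotai/kimi-k2"),
        max_iterations=int(_get(env, "MAX_ITERATIONS", "10")),
        mock_mode=_get(env, "MOCK_MODE", "false").lower() == "true",
        backend=backend,
        max_agents=int(_get(env, "FLOWCADEMY_MAX_AGENTS", "10")),
    )


example = load_settings({
    "MOCK_MODE": "true  # Set false for real tools",
    "FLOWCADEMY_BACKEND": "dask",
    "MAX_ITERATIONS": "5",
})
```

Splitting on `#` would truncate values that legitimately contain that character, so a real loader should only apply it to settings where that cannot happen.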

Advanced Usage

Custom Agent Development

from typing import List

from academy.agent import Agent, action
from flowcademy import AcademyWorkflowIntegration

class CustomAnalysisAgent(Agent):
    """Custom agent for specialized analysis"""

    @action
    async def analyze_trajectory(self, history: List[dict]) -> dict:
        # Implement custom analysis here (placeholder: count trajectory steps)
        analysis_results = {"num_steps": len(history)}
        return {"insights": analysis_results}

# Register with FlowCademy (flow is an existing WorkflowEngine instance)
async with AcademyWorkflowIntegration(flow) as integration:
    analysis_task = integration.create_agent_task(
        CustomAnalysisAgent, 'analyze_trajectory'
    )

Distributed Execution

# Configure for HPC execution
from radical.asyncflow import RadicalExecutionBackend, WorkflowEngine

backend = RadicalExecutionBackend({
    'resource': 'summit.olcf.ornl.gov',
    'project': 'your-project',
    'queue': 'batch',
    'walltime': 120,
    'cpus': 128
})

flow = WorkflowEngine(backend=backend)

Testing

# Run FlowCademy integration tests
python test_flowcademy_workflow.py

# Test individual agents
python -m pytest tests/test_agents.py

# Test workflow orchestration
python -m pytest tests/test_workflows.py

Benefits of FlowCademy Architecture

  1. Modularity: Each component is an independent, reusable agent
  2. Scalability: The same workflow runs on a laptop or an HPC system by switching the execution backend
  3. Fault Tolerance: Agent failures are isolated and recoverable
  4. Observability: Built-in logging and monitoring capabilities
  5. Type Safety: Academy's action decorators ensure type-safe communication
  6. Async Native: Leverages Python's asyncio for efficient concurrency
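The fault-isolation point (item 3) can be illustrated with plain `asyncio`. How FlowCademy and Academy actually handle recovery is not shown in this README, so the sketch below only demonstrates the general idea: one failing task does not cancel its siblings. `run_agent` and `run_isolated` are hypothetical names.

```python
import asyncio
from typing import Dict


async def run_agent(name: str, fail: bool) -> str:
    """Stand-in for an agent action that may raise."""
    await asyncio.sleep(0)
    if fail:
        raise RuntimeError(f"{name} crashed")
    return f"{name} ok"


async def run_isolated(jobs: Dict[str, bool]) -> Dict[str, str]:
    # return_exceptions=True collects failures as values instead of
    # propagating the first exception to the caller
    results = await asyncio.gather(
        *(run_agent(name, fail) for name, fail in jobs.items()),
        return_exceptions=True,
    )
    return {
        name: f"failed: {result}" if isinstance(result, Exception) else result
        for name, result in zip(jobs, results)
    }


outcome = asyncio.run(
    run_isolated({"scorer": False, "alphafold": True, "optimizer": False})
)
```

The caller receives a per-agent outcome and can decide which failed agents to retry, which is the behavior the list describes as "isolated and recoverable".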

About

Agentic representation of the IMPRESS workflow using FlowCademy and the AI Agentic Framework
