IMPRESS-LLM

Adaptive protein design through LLM-guided optimization with FlowCademy orchestration

Overview

IMPRESS-LLM is a protein design framework that uses Large Language Models (LLMs) to guide protein sequence optimization. Built on FlowCademy's agent-based architecture, it orchestrates protein engineering tools through a modular system of autonomous agents.

Key Innovation

Traditional protein design workflows use fixed optimization strategies. IMPRESS-LLM instead adapts its strategy during a run:

Each component (sequence optimization, structure prediction, scoring) is an independent Academy agent
LLM agents analyze optimization trajectories in real time and dynamically adjust strategies
Agents communicate through FlowCademy's type-safe remote method invocation
AsyncFlow workflows enable parallel processing of multiple sequences
The entire system scales from local execution to HPC environments
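
The adaptive loop described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: `score_sequence`, `mutate`, `choose_strategy`, and `optimize` are hypothetical names, the score is a toy function, and a simple rule stands in for the LLM call.

```python
import random
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def score_sequence(sequence: str) -> float:
    """Toy score (hypothetical): fraction of alanine residues."""
    return sequence.count("A") / len(sequence)


def mutate(sequence: str, mutation_rate: float, rng: random.Random) -> str:
    """Replace each residue with a random one with probability mutation_rate."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < mutation_rate else residue
        for residue in sequence
    )


def choose_strategy(history: List[float], mutation_rate: float) -> float:
    """Rule-based stand-in for the LLM: explore more when progress stalls."""
    if len(history) >= 3 and history[-1] <= history[-3]:
        return min(mutation_rate * 1.5, 0.5)  # stalled: raise the rate
    return max(mutation_rate * 0.9, 0.01)  # improving: narrow the search


def optimize(sequence: str, iterations: int = 10,
             seed: int = 0) -> Tuple[str, float, List[float]]:
    """Keep the best candidate seen and adjust the rate after every step."""
    rng = random.Random(seed)
    best, best_score = sequence, score_sequence(sequence)
    mutation_rate, history = 0.1, [best_score]
    for _ in range(iterations):
        candidate = mutate(best, mutation_rate, rng)
        candidate_score = score_sequence(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
        history.append(best_score)
        mutation_rate = choose_strategy(history, mutation_rate)
    return best, best_score, history
```

A fixed-strategy workflow would keep `mutation_rate` constant. Here the rate is revised from the score history after every iteration, which is the role the LLM agents play in IMPRESS-LLM.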

Architecture

FlowCademy-Based Design

IMPRESS-LLM is built on three foundational technologies:

Academy: Provides the agent framework with autonomous actors
AsyncFlow: Handles workflow orchestration and task dependencies
FlowCademy: Integrates Academy agents with AsyncFlow workflows

┌─────────────────────────────────────────────────────────┐
│                FlowCademy Integration Layer             │
│         Bridges Academy agents with AsyncFlow           │
└─────────────────────┬───────────────────────────────────┘
                      │
     ┌────────────────┼────────────────┐
     │                │                │
┌────▼─────────┐ ┌───▼──────────┐ ┌──▼───────────────┐
│   Academy    │ │   AsyncFlow  │ │ Agent Manager    │
│   Agents     │ │   Workflows  │ │ & Communication  │
└──────┬───────┘ └─────┬────────┘ └────────┬─────────┘
       │               │                  │
   ┌───▼───┐      ┌────▼────┐         ┌────▼────┐
   │Protein│      │Workflow │         │Exchange │
   │Agents │      │Tasks    │         │Clients  │
   └───────┘      └─────────┘         └─────────┘

Agent Architecture

# Each component is an autonomous Academy agent
from academy.agent import Agent, action


class SequenceOptimizerAgent(Agent):
    """Handles mutation strategies and sequence optimization"""

    @action
    async def apply_mutations(self, sequence: str,
                              mutation_rate: float) -> str:
        # Intelligent mutation logic (body omitted in this outline)
        ...

    @action
    async def crossover_sequences(self, seq1: str, seq2: str) -> str:
        # Genetic crossover operations (body omitted in this outline)
        ...


class ReportGeneratorAgent(Agent):
    """Manages result collection and visualization"""

    @action
    async def generate_report(self) -> str:
        # Creates comprehensive analysis reports (body omitted in this outline)
        ...
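
The README does not say which crossover operator `crossover_sequences` uses. As one possibility, single-point crossover could look like the sketch below. The function name `single_point_crossover` and the equal-length requirement are assumptions made for illustration.

```python
import random
from typing import Optional


def single_point_crossover(seq1: str, seq2: str,
                           rng: Optional[random.Random] = None) -> str:
    """Join a prefix of seq1 with the matching suffix of seq2."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have equal length")
    if len(seq1) < 2:
        raise ValueError("need at least two residues to cross over")
    rng = rng or random.Random()
    # Cut point in 1..len-1 so both parents contribute at least one residue
    cut = rng.randint(1, len(seq1) - 1)
    return seq1[:cut] + seq2[cut:]
```

Passing a seeded `random.Random` makes the cut point reproducible, which helps when comparing optimization runs.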

Installation

# Clone repository
git clone https://github.com/impress-llm/impress-llm.git
cd impress-llm

# Install dependencies including FlowCademy
pip install -r requirements.txt

# Configure environment
cp .env.template .env
# Edit .env and add your OpenRouter API key

Quick Start

1. Basic Workflow with FlowCademy

# Test with mock tools (no GPU required)
export MOCK_MODE=true
cd src
python main.py

How FlowCademy Primitives Are Used

1. Agent Task Creation

FlowCademy's create_agent_task bridges Academy agents with AsyncFlow:

from flowcademy import AcademyWorkflowIntegration
from radical.asyncflow import ThreadExecutionBackend, WorkflowEngine

# Initialize workflow engine
backend = ThreadExecutionBackend({})
flow = WorkflowEngine(backend=backend)

# sequence_name identifies the sequence being optimized (set by the caller)
async with AcademyWorkflowIntegration(flow) as integration:
    # Create agent tasks
    af_task = integration.create_agent_task(
        AlphaFoldAgent, 'predict_structure',
        agent_id=f'af_{sequence_name}'
    )

    scorer_task = integration.create_agent_task(
        ScoringAgent, 'predict_affinity',
        agent_id=f'scorer_{sequence_name}'
    )

2. Workflow Composition

AsyncFlow's function tasks enable complex workflows:

@flow.function_task
async def predict_structure():
    return await af_task(current_sequence)

@flow.function_task
async def score_structure(struct_result):
    return await scorer_task(struct_result)

# Execute workflow with implicit dependency tracking
struct_result = await predict_structure()
score_result = await score_structure(struct_result)
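
The dependency pattern above, with several sequences processed in parallel, can be illustrated with the standard library alone. This sketch uses plain `asyncio` rather than AsyncFlow; the two step functions are toy stand-ins, and `run_pipeline` and `run_all` are hypothetical names.

```python
import asyncio
from typing import Dict, List


async def predict_structure(sequence: str) -> Dict[str, object]:
    """Stand-in for the structure prediction step."""
    await asyncio.sleep(0)  # yield control, as a real remote call would
    return {"sequence": sequence, "length": len(sequence)}


async def score_structure(structure: Dict[str, object]) -> float:
    """Stand-in scorer: a toy value derived from the structure."""
    await asyncio.sleep(0)
    return 1.0 / (1 + int(structure["length"]))


async def run_pipeline(sequence: str) -> float:
    # score_structure consumes predict_structure's output, so it runs second
    structure = await predict_structure(sequence)
    return await score_structure(structure)


async def run_all(sequences: List[str]) -> List[float]:
    # Pipelines for different sequences are independent and run concurrently
    return list(await asyncio.gather(*(run_pipeline(s) for s in sequences)))
```

Calling `asyncio.run(run_all(["MKT", "MKTAYIAK"]))` returns one score per input, in input order. Within a pipeline the steps are sequential, while separate pipelines interleave.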

3. Agent Communication

Academy's Handle protocol enables remote agent invocation:

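# Agents can invoke actions on other agents.
# 'update_parameters' is an assumed action name: use whichever action on
# SequenceOptimizerAgent accepts mutation_rate and temperature.
optimizer_update_task = integration.create_agent_task(
    SequenceOptimizerAgent, 'update_parameters'
)

# Update parameters across agents
await optimizer_update_task(
    mutation_rate=0.2,
    temperature=0.5
)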

4. State Management

Each agent maintains independent state:

from typing import Dict, List

from academy.agent import Agent


class SequenceOptimizerAgent(Agent):
    # Per-agent state; each agent instance holds its own copy
    mutation_rate: float
    temperature: float
    optimization_history: Dict[str, List]

    async def on_setup(self) -> None:
        """Initialize agent state"""
        self.mutation_rate = 0.1
        self.temperature = 0.5
        self.optimization_history = {}

Adaptive Strategies

The LLM Strategy Agent analyzes optimization progress and adapts the strategy as the run proceeds:

1. Dynamic Mutation Control

{
  "action": "continue",
  "parameters": {
    "mutation_rate": 0.15,
    "temperature": 0.3,
    "focus_regions": [[45, 67], [102, 115]]
  },
  "reason": "Focusing mutations on flexible loop regions"
}
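
A decision in this format could be validated and turned into candidate mutation positions as sketched below. The helper names `parse_decision` and `positions_to_mutate` are hypothetical, and treating `focus_regions` as 1-based inclusive residue ranges is an assumption, since the README does not state the indexing convention. The `temperature` field is left unused here.

```python
import json
import random
from typing import List, Tuple

decision_json = """
{
  "action": "continue",
  "parameters": {
    "mutation_rate": 0.15,
    "temperature": 0.3,
    "focus_regions": [[45, 67], [102, 115]]
  },
  "reason": "Focusing mutations on flexible loop regions"
}
"""


def parse_decision(raw: str) -> Tuple[str, float, List[Tuple[int, int]]]:
    """Validate the fields used below and return (action, rate, regions)."""
    decision = json.loads(raw)
    params = decision["parameters"]
    rate = float(params["mutation_rate"])
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"mutation_rate out of range: {rate}")
    regions = [(int(start), int(end))
               for start, end in params.get("focus_regions", [])]
    if any(start > end for start, end in regions):
        raise ValueError("focus region start exceeds end")
    return decision["action"], rate, regions


def positions_to_mutate(length: int, rate: float,
                        regions: List[Tuple[int, int]],
                        rng: random.Random) -> List[int]:
    """Pick positions inside the focus regions, each with probability rate.

    Assumes regions are 1-based inclusive residue ranges.
    """
    candidates = [
        pos for start, end in regions
        for pos in range(start, end + 1) if 1 <= pos <= length
    ]
    return [pos for pos in candidates if rng.random() < rate]
```

Validating the LLM's output before applying it keeps a malformed or out-of-range response from reaching the optimizer.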

2. Cross-Sequence Learning

The system uses FlowCademy's agent communication to share insights across parallel optimizations, so that findings from one sequence can inform the optimization of others.

3. Scalable Execution

FlowCademy supports multiple execution backends:

ThreadExecutionBackend: Local multi-threaded execution
DaskExecutionBackend: Distributed computing clusters
RadicalExecutionBackend: HPC environments with SLURM/PBS

Example Results

OPTIMIZATION PROGRESS (FlowCademy Orchestrated)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Sequence: Target_A
Agent ID: optimizer_Target_A
Initial Score: 0.623 → Final Score: 0.891
Iterations: 7
Active Agents: 4 (AlphaFold, Scorer, LLM, Optimizer)

Agent Communication Log:
- optimizer → llm: Request strategy (iteration 3)
- llm → optimizer: Increase mutation rate
- scorer → reporter: New best score 0.891
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Configuration

Environment Variables

# LLM Configuration
OPENROUTER_API_KEY=your-key-here
LLM_MODEL=moonshotai/kimi-k2
LLM_API_BASE=https://openrouter.ai/api/v1

# Workflow Settings
MAX_ITERATIONS=10
FASTA_PATTERN=proteins/*.fasta
MOCK_MODE=true  # Set false for real tools

# FlowCademy Settings
FLOWCADEMY_BACKEND=thread  # thread, dask, or radical
FLOWCADEMY_MAX_AGENTS=10
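
One way to read these variables into a typed object is sketched below. `Settings` and `load_settings` are hypothetical names, not part of the project, and the fallback defaults simply reuse the example values shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping

VALID_BACKENDS = ("thread", "dask", "radical")


@dataclass(frozen=True)
class Settings:
    llm_model: str
    llm_api_base: str
    max_iterations: int
    fasta_pattern: str
    mock_mode: bool
    backend: str
    max_agents: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment-style variables, validating as we go."""
    backend = env.get("FLOWCADEMY_BACKEND", "thread").strip().lower()
    if backend not in VALID_BACKENDS:
        raise ValueError(f"FLOWCADEMY_BACKEND must be one of {VALID_BACKENDS}")
    return Settings(
        llm_model=env.get("LLM_MODEL", "moonshotai/kimi-k2"),
        llm_api_base=env.get("LLM_API_BASE", "https://openrouter.ai/api/v1"),
        max_iterations=int(env.get("MAX_ITERATIONS", "10")),
        fasta_pattern=env.get("FASTA_PATTERN", "proteins/*.fasta"),
        mock_mode=env.get("MOCK_MODE", "true").strip().lower() == "true",
        backend=backend,
        max_agents=int(env.get("FLOWCADEMY_MAX_AGENTS", "10")),
    )
```

Rejecting an unknown backend name at load time reports a configuration typo before any agents are launched.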

Advanced Usage

Custom Agent Development

from typing import List

from academy.agent import Agent, action
from flowcademy import AcademyWorkflowIntegration

class CustomAnalysisAgent(Agent):
    """Custom agent for specialized analysis"""

    @action
    async def analyze_trajectory(self, history: List[dict]) -> dict:
        # Implement custom analysis; this placeholder only counts iterations
        analysis_results = {"num_iterations": len(history)}
        return {"insights": analysis_results}

# Register with FlowCademy (flow is the WorkflowEngine created earlier)
async with AcademyWorkflowIntegration(flow) as integration:
    analysis_task = integration.create_agent_task(
        CustomAnalysisAgent, 'analyze_trajectory'
    )

Distributed Execution

# Configure for HPC execution
from radical.asyncflow import RadicalExecutionBackend, WorkflowEngine

backend = RadicalExecutionBackend({
    'resource': 'summit.olcf.ornl.gov',
    'project': 'your-project',
    'queue': 'batch',
    'walltime': 120,
    'cpus': 128
})

flow = WorkflowEngine(backend=backend)

Testing

# Run FlowCademy integration tests
python test_flowcademy_workflow.py

# Test individual agents
python -m pytest tests/test_agents.py

# Test workflow orchestration
python -m pytest tests/test_workflows.py

Benefits of FlowCademy Architecture

Modularity: Each component is an independent, reusable agent
Scalability: The same workflow runs on a laptop or a supercomputer by changing the execution backend
Fault Tolerance: Agent failures are isolated and recoverable
Observability: Built-in logging and monitoring capabilities
Type Safety: Academy's action decorators ensure type-safe communication
Async Native: Leverages Python's asyncio for efficient concurrency
