A multi-agent system for automatically generating GenePattern modules.
The generate-module.py script orchestrates multiple AI agents to:
- Research bioinformatics tools using web search and analysis
- Plan module structure, parameters, and architecture
- Generate module artifacts (Dockerfile, wrapper scripts, manifests, etc.)
- Validate each artifact using the Module Toolkit linters
- Create a complete, ready-to-use GenePattern module
Environment Setup:
pip install -r requirements.txt
Environment Variables: The defaults should work for most installations. To change them, create or edit a .env file with your API keys and preferences. The following variables are recognized:
- DEFAULT_LLM_MODEL: LLM model for agents (default: Qwen3)
- BRAVE_API_KEY: API key for web research (optional, but recommended if you have one)
- MAX_ARTIFACT_LOOPS: Maximum validation retry attempts per artifact (default: 5)
- MODULE_OUTPUT_DIR: Output directory (default: ./generated-modules)
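As an illustration of how these variables and their defaults fit together, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, not taken from generate-module.py, and the sketch only reads the process environment; it does not parse a .env file.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    default_llm_model: str
    brave_api_key: Optional[str]
    max_artifact_loops: int
    module_output_dir: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        default_llm_model=env.get("DEFAULT_LLM_MODEL", "Qwen3"),
        # BRAVE_API_KEY is optional; an empty value is treated as unset.
        brave_api_key=env.get("BRAVE_API_KEY") or None,
        max_artifact_loops=int(env.get("MAX_ARTIFACT_LOOPS", "5")),
        module_output_dir=env.get("MODULE_OUTPUT_DIR", "./generated-modules"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check without touching the real environment.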
Run the script and follow the prompts:
python generate-module.py
You'll be prompted for:
- Tool name (required)
- Tool version (optional)
- Primary language (optional)
- Brief description (optional)
- Repository URL (optional)
- Documentation URL (optional)
Tool name: samtools
Tool version: 1.19
Primary language: C
Brief description: Tools for manipulating SAM/BAM files
Repository URL: https://github.com/samtools/samtools
Documentation URL: http://www.htslib.org/doc/samtools.html
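The prompt answers above could be collected into a single structured record before being handed to the agents. The sketch below is an assumption about how that might look; `ToolInfo` and `tool_info_from_answers` are hypothetical names, and only the required/optional split comes from the prompt list.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToolInfo:
    """Hypothetical record of the prompt answers; only the name is required."""
    name: str
    version: Optional[str] = None
    language: Optional[str] = None
    description: Optional[str] = None
    repository_url: Optional[str] = None
    documentation_url: Optional[str] = None


def tool_info_from_answers(answers: Dict[str, str]) -> ToolInfo:
    """Drop blank answers and enforce that a tool name was given."""
    cleaned = {key: value.strip() for key, value in answers.items()
               if value and value.strip()}
    if "name" not in cleaned:
        raise ValueError("Tool name is required")
    return ToolInfo(**cleaned)
```

For the samtools example, `tool_info_from_answers({"name": "samtools", "version": "1.19", "language": "C"})` would leave the unanswered fields as `None`.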
- Agent: researcher_agent
- Purpose: Gather comprehensive information about the tool
- Actions:
  - Web search for documentation and examples
  - Analyze command-line interface and parameters
  - Identify dependencies and requirements
  - Research common usage patterns
- Agent: planner_agent
- Purpose: Create implementation plan based on research
- Actions:
  - Map parameters to GenePattern types
  - Design parameter groupings for UI
  - Plan module architecture and dependencies
  - Define validation and testing strategy
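To make the parameter-mapping step concrete, here is a rough sketch. The type labels (File, Choice, Integer, Text) are the ones that appear in the sample generation report; the `kind` values, the `_TYPE_BY_KIND` table, and the `genepattern_type` function are illustrative assumptions, not the planner's actual logic.

```python
from typing import Optional, Sequence

# Hypothetical mapping from a command-line argument kind to a report type label.
_TYPE_BY_KIND = {
    "path": "File",
    "int": "Integer",
}


def genepattern_type(kind: str, choices: Optional[Sequence[str]] = None) -> str:
    """Pick a type label for one parameter; anything unrecognized becomes Text."""
    if choices:
        # A fixed set of allowed values is presented as a drop-down choice.
        return "Choice"
    return _TYPE_BY_KIND.get(kind, "Text")
```

Falling back to Text keeps unknown argument kinds usable, at the cost of weaker input checking in the UI.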
- Agents: Multiple artifact-specific agents
- Current Artifacts (in generation order):
  - wrapper_agent: Generates wrapper scripts for tool integration
  - manifest_agent: Creates module manifest with metadata and command line
  - paramgroups_agent: Creates parameter groupings for UI organization
  - gpunit_agent: Generates test definitions for automated testing
  - documentation_agent: Generates user documentation
  - dockerfile_agent: Creates Dockerfile
For each artifact:
- Generate content using specialized agent
- Write to module directory
- Validate using appropriate linter tool
- If validation fails, retry up to MAX_ARTIFACT_LOOPS times
- Include feedback from previous attempts in retry prompts
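The generate, validate, and retry steps above can be sketched as a small loop. This is a simplified illustration, not the script's implementation: `generate_with_retries` is a hypothetical helper, and the `generate` and `validate` callables stand in for an artifact agent and its linter.

```python
from typing import Callable, List, Optional, Tuple


def generate_with_retries(
    generate: Callable[[List[str]], str],
    validate: Callable[[str], Tuple[bool, str]],
    max_loops: int = 5,
) -> Tuple[Optional[str], int]:
    """Return (content, attempts); content is None if every attempt failed."""
    feedback: List[str] = []
    for attempt in range(1, max_loops + 1):
        # Earlier validation errors are passed back so the agent can correct them.
        content = generate(feedback)
        ok, message = validate(content)
        if ok:
            return content, attempt
        feedback.append(f"Attempt {attempt} failed validation: {message}")
    return None, max_loops
```

Accumulating feedback rather than keeping only the latest error lets later attempts avoid repeating a mistake that was already reported.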
Generated modules are saved to {MODULE_OUTPUT_DIR}/{tool_name}_{timestamp}/:
samtools_20241222_143022/
├── wrapper.py # Execution wrapper script
├── manifest # Module metadata and command line
├── paramgroups.json # UI parameter groups
├── test.yml # GPUnit test definition
├── README.md # User documentation
└── Dockerfile # Container definition
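The directory name in the example is consistent with a `YYYYMMDD_HHMMSS` timestamp appended to the tool name. A minimal sketch of that naming scheme follows; the `module_dir` function is a hypothetical helper and the exact format string is inferred from the example, not confirmed by the script.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def module_dir(output_dir: str, tool_name: str,
               now: Optional[datetime] = None) -> Path:
    """Build {MODULE_OUTPUT_DIR}/{tool_name}_{timestamp} for a new module."""
    now = now or datetime.now()
    # Timestamp format inferred from the example directory name.
    return Path(output_dir) / f"{tool_name}_{now.strftime('%Y%m%d_%H%M%S')}"
```

With `datetime(2024, 12, 22, 14, 30, 22)` and the default output directory, the final path component is `samtools_20241222_143022`.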
The script provides real-time status updates:
[14:30:22] INFO: Creating module directory for samtools
[14:30:22] INFO: Created module directory: ./generated-modules/samtools_20241222_143022
[14:30:22] INFO: Starting research on the bioinformatics tool
[14:30:25] INFO: Research phase completed successfully
[14:30:25] INFO: Starting module planning based on research findings
[14:30:28] INFO: Planning phase completed successfully
[14:30:28] INFO: Starting artifact generation
[14:30:28] INFO: Generating dockerfile...
[14:30:31] INFO: Attempt 1/5 for dockerfile
[14:30:34] INFO: Generated Dockerfile (1847 characters)
[14:30:34] INFO: Validating dockerfile...
[14:30:37] INFO: Validation passed for dockerfile
[14:30:37] INFO: Successfully generated and validated dockerfile
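The status lines above follow a `[HH:MM:SS] LEVEL: message` pattern. For anyone post-processing the output, a formatter along these lines would reproduce it; `format_status` is a hypothetical name and the script may build its messages differently.

```python
from datetime import datetime
from typing import Optional


def format_status(message: str, level: str = "INFO",
                  now: Optional[datetime] = None) -> str:
    """Format one status line in the style shown in the sample output."""
    now = now or datetime.now()
    return f"[{now:%H:%M:%S}] {level}: {message}"
```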
After completion, you'll receive a comprehensive report:
============================================================
Module Generation Report
============================================================
Tool Name: samtools
Module Directory: ./generated-modules/samtools_20241222_143022
Research Complete: ✓
Planning Complete: ✓
Artifact Status:
wrapper:
Generated: ✓
Validated: ✓
Attempts: 1
manifest:
Generated: ✓
Validated: ✓
Attempts: 1
paramgroups:
Generated: ✓
Validated: ✓
Attempts: 1
gpunit:
Generated: ✓
Validated: ✓
Attempts: 1
documentation:
Generated: ✓
Validated: ✓
Attempts: 1
dockerfile:
Generated: ✓
Validated: ✓
Attempts: 1
Parameters Identified: 23
- input_file: File (Required)
- output_format: Choice (Optional)
- quality_threshold: Integer (Optional)
- threads: Integer (Optional)
- memory_limit: Text (Optional)
... and 18 more parameters
============================================================
🎉 MODULE GENERATION SUCCESSFUL!
Your GenePattern module is ready in: ./generated-modules/samtools_20241222_143022
============================================================
The script follows Pydantic AI best practices for multi-agent systems:
- Agent Specialization: Each agent has a focused area of expertise
- Structured Communication: Agents pass structured data between phases
- Error Handling: Robust error handling with retry mechanisms
- Validation Integration: Built-in validation using MCP server tools
- Status Tracking: Comprehensive progress monitoring and reporting
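Structured communication and status tracking could be modeled with small typed records such as the ones below. The field names mirror the generation report (Generated, Validated, Attempts), but the `ArtifactStatus` and `ModuleStatus` classes themselves are assumptions made for illustration, written with standard-library dataclasses instead of the models the script actually uses.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ArtifactStatus:
    """Hypothetical per-artifact record matching the report's three fields."""
    generated: bool = False
    validated: bool = False
    attempts: int = 0


@dataclass
class ModuleStatus:
    """Hypothetical run-level record passed between phases."""
    tool_name: str
    research_complete: bool = False
    planning_complete: bool = False
    artifacts: Dict[str, ArtifactStatus] = field(default_factory=dict)

    @property
    def successful(self) -> bool:
        """True only when both phases finished and every artifact validated."""
        return (
            self.research_complete
            and self.planning_complete
            and bool(self.artifacts)
            and all(a.generated and a.validated for a in self.artifacts.values())
        )
```

Keeping the status in one record means the final report can be derived from the same data the phases exchange, rather than being reconstructed from log output.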