VeriFlow

Autonomous Research Reliability Engineer — Converts scientific publications into verifiable, executable computational workflows using Gemini 3 AI agents.

Built for the Gemini 3 Hackathon.

About

VeriFlow is an end-to-end platform that tackles the research reproducibility crisis by autonomously converting scientific publications into executable computational workflows. Given a PDF of a research paper, VeriFlow runs a LangGraph-orchestrated pipeline of three specialized Gemini 3 AI agents. The pipeline extracts the methodology, generates standards-compliant CWL v1.3 workflows with Docker containers, validates them through a self-healing retry loop, and reviews the output for scientific correctness. Agent output is streamed over WebSocket in real time to an interactive Vue 3 frontend.


The Problem

Scientific research faces a reproducibility crisis: in a 2016 Nature survey of more than 1,500 researchers, over 70% reported having tried and failed to reproduce another scientist's experiments. Key barriers include:

  • Implicit methodology — Critical experimental details are buried in PDF publications as unstructured text
  • Missing computational environments — Papers describe tools and models without specifying exact versions, dependencies, or configurations
  • No executable artifacts — Methodologies exist only as prose, not as runnable code
  • Manual workflow creation — Converting a paper's methodology into an executable pipeline requires deep domain expertise and significant engineering effort

Our Solution - VeriFlow

VeriFlow bridges the gap between scientific publications and executable workflows through an autonomous, AI-driven pipeline:

  1. Upload a scientific publication (PDF) with optional user context
  2. Scholar Agent (Gemini 3 Pro) extracts the full methodology as a structured ISA-JSON hierarchy using native PDF upload, grounding with Google Search, and agentic vision for diagram analysis
  3. Engineer Agent (Gemini 3 Pro) generates a complete CWL v1.3 workflow with Dockerfiles, tool definitions, and infrastructure code using the extracted ISA-JSON and repository context
  4. Validate Node checks generated artifacts for structural correctness (Dockerfile has FROM, CWL has cwlVersion) — if errors are found, the self-healing loop retries the Engineer up to 3 times with error context
  5. Reviewer Agent (Gemini 3 Flash) critiques the final output for scientific correctness, comparing the ISA extraction against the generated code
  6. Plan & Apply — Users can chat with any agent, refine directives, and restart the workflow from any node
  7. Export results as a SPARC SDS-compliant ZIP with full provenance tracking

Gemini 3 Integration

VeriFlow leverages 4 Gemini 3 features through the google-genai SDK (from google import genai):

| # | Feature | How It's Used | Agent(s) |
|---|---------|---------------|----------|
| 1 | Pydantic Structured Output | All agents use Pydantic BaseModel subclasses as response_schema parameter with response_mime_type="application/json" for type-safe, validated JSON responses (AnalysisResult, WorkflowResult, ValidationResult, ErrorTranslationResult) | All 3 |
| 2 | Native PDF Upload | types.Part.from_bytes(data=file_data, mime_type="application/pdf") for multimodal publication ingestion; the entire PDF is sent to Gemini for full-document analysis | Scholar |
| 3 | Thought Signature Preservation | _extract_thoughts() captures reasoning chains from response.candidates[].content.parts where part.thought == True, preserving reasoning across multi-turn conversations for iterative CWL generation and validation-fix loops | Engineer, Reviewer |
| 4 | Async Streaming | client.aio.models.generate_content_stream() for real-time token-by-token streaming via WebSocket to the frontend console | All 3 |
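
As an illustration of row 3, the following is a minimal sketch of how thought parts might be collected from a response. It assumes only the attribute layout named in the table (candidates, content.parts, part.thought) plus a text attribute on each part; the function name extract_thoughts and the SimpleNamespace stand-in objects are hypothetical and are not VeriFlow's actual implementation.

```python
from types import SimpleNamespace
from typing import Any, List


def extract_thoughts(response: Any) -> List[str]:
    """Collect text from parts flagged as thoughts across all candidates."""
    thoughts: List[str] = []
    for candidate in getattr(response, "candidates", None) or []:
        content = getattr(candidate, "content", None)
        for part in getattr(content, "parts", None) or []:
            # Only parts explicitly marked thought=True carry reasoning text.
            if getattr(part, "thought", False) is True and getattr(part, "text", None):
                thoughts.append(part.text)
    return thoughts


# Stand-in response (hypothetical) that mimics the attribute layout above.
fake_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(
            content=SimpleNamespace(
                parts=[
                    SimpleNamespace(thought=True, text="Check the Dockerfile base image."),
                    SimpleNamespace(thought=None, text='{"status": "ok"}'),
                ]
            )
        )
    ]
)

print(extract_thoughts(fake_response))
```

The defensive getattr calls are a design choice for the sketch: a response with no candidates or an empty content yields an empty list instead of raising.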

Agent Architecture

| Agent | Model | Thinking Budget | Responsibilities |
|-------|-------|-----------------|------------------|
| ScholarAgent | gemini-3-pro-preview | HIGH (24,576) | PDF analysis, ISA-JSON extraction, confidence scoring, tool/model identification |
| EngineerAgent | gemini-3-pro-preview | HIGH (24,576) | CWL v1.3 workflow generation, Dockerfile creation, infrastructure code |
| ReviewerAgent | gemini-3-flash-preview | MEDIUM (8,192) | ISA vs code critique, scientific correctness validation, approval/rejection decision |
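
The per-agent settings above could be held in a small lookup structure. This is a sketch only: the model names and budgets are copied from the table, while the AgentConfig class, the AGENT_CONFIGS dictionary, and the config_for helper are hypothetical names and do not reflect the layout of the project's config.yaml.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    model: str
    thinking_level: str
    thinking_budget: int  # token budget as listed in the table


# Values copied from the Agent Architecture table; the dict layout is hypothetical.
AGENT_CONFIGS = {
    "scholar": AgentConfig("gemini-3-pro-preview", "HIGH", 24_576),
    "engineer": AgentConfig("gemini-3-pro-preview", "HIGH", 24_576),
    "reviewer": AgentConfig("gemini-3-flash-preview", "MEDIUM", 8_192),
}


def config_for(agent_name: str) -> AgentConfig:
    """Look up an agent's settings, failing loudly on unknown names."""
    try:
        return AGENT_CONFIGS[agent_name.lower()]
    except KeyError:
        raise ValueError(f"unknown agent: {agent_name!r}") from None


print(config_for("reviewer"))
```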

GeminiClient — Central SDK Wrapper

All Gemini 3 interactions go through a single GeminiClient class:

from typing import Dict, List  # needed for the return annotations below

from google import genai
from google.genai import types

class GeminiClient:
    model_name = "gemini-3.0-flash"  # Default fallback

    async def analyze_file(self, file_path, prompt, model, stream_callback):
        """Native PDF upload via Part.from_bytes + JSON response + async streaming"""

    async def generate_content(self, prompt, model, response_schema, stream_callback):
        """Text-only structured generation with optional streaming"""

    def _extract_thoughts(self, response) -> List[str]:
        """Chain-of-thought extraction from response candidates"""

    def _robust_parse_json(self, text) -> Dict:
        """json_repair-based parsing for Markdown backticks and malformed JSON"""
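
To make the last method concrete, here is a rough stdlib-only sketch of the kind of cleanup it describes: stripping a Markdown code fence and recovering a JSON object from surrounding text. The project itself is described as using json_repair; this sketch swaps in plain json and a regular expression, so it handles far fewer malformed cases. The name parse_model_json is hypothetical.

```python
import json
import re
from typing import Any, Dict

# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3
FENCE_RE = re.compile(rf"^{FENCE}[\w-]*[ \t]*\n(.*?)\n?{FENCE}\s*$", re.DOTALL)


def parse_model_json(text: str) -> Dict[str, Any]:
    """Parse a JSON object from model output that may be wrapped in a code fence."""
    cleaned = text.strip()
    match = FENCE_RE.match(cleaned)
    if match:
        cleaned = match.group(1).strip()
    try:
        parsed = json.loads(cleaned)
    except json.JSONDecodeError:
        # Fallback: try the outermost {...} span, e.g. when prose surrounds the JSON.
        start, end = cleaned.find("{"), cleaned.rfind("}")
        if start == -1 or end <= start:
            raise
        parsed = json.loads(cleaned[start : end + 1])
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    return parsed


print(parse_model_json(FENCE + 'json\n{"decision": "approved"}\n' + FENCE))
```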

Architecture

infrastructure.png

Docker Compose — 10 Services

| Service | Port | Purpose |
|---------|------|---------|
| backend | 8000 | FastAPI backend (Python 3.11) |
| frontend | 3000 | Vue 3 SPA via Nginx |
| postgres | 5432 | PostgreSQL 15 (Airflow database) |
| minio | 9000/9001 | S3-compatible object storage (4 buckets) |
| minio-init | | Ephemeral bucket initialization |
| airflow-apiserver | 8080 | Airflow 3.0.6 REST API server |
| airflow-scheduler | | Airflow task scheduler (LocalExecutor) |
| dind | | Docker-in-Docker for CWL execution |
| cwl | | CWL runner (cwltool) |
| veriflow-sandbox | | Sandbox for script execution (PyTorch + nnU-Net) |

Key Features

LangGraph-Orchestrated Multi-Agent Pipeline

A StateGraph with 4 nodes (Scholar, Engineer, Validate, Reviewer) orchestrates the full PDF-to-workflow pipeline. Conditional edges enable a self-healing retry loop where validation failures automatically route back to the Engineer with error context, up to 3 iterations.
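
The control flow described above can be traced with a plain-Python stand-in for the graph. This sketch does not use LangGraph; the node bodies are stubs, and the state keys, the next_node helper, and the convention of incrementing retry_count on each failed validation are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

MAX_RETRIES = 3  # mirrors the "up to 3 iterations" limit
State = Dict[str, Any]


def scholar_node(state: State) -> State:
    state["isa_json"] = {"study": "placeholder"}  # stand-in for the extraction
    return state


def engineer_node(state: State) -> State:
    # Stub: the first attempt omits cwlVersion; a retry that sees
    # validation errors in the state produces a corrected document.
    fixed = bool(state.get("validation_errors"))
    state["cwl"] = ("cwlVersion: v1.3\n" if fixed else "") + "class: Workflow\n"
    return state


def validate_node(state: State) -> State:
    errors = [] if "cwlVersion" in state["cwl"] else ["CWL is missing cwlVersion"]
    state["validation_errors"] = errors
    if errors:
        state["retry_count"] = state.get("retry_count", 0) + 1
    return state


def reviewer_node(state: State) -> State:
    state["decision"] = "rejected" if state["validation_errors"] else "approved"
    return state


NODES: Dict[str, Callable[[State], State]] = {
    "scholar": scholar_node,
    "engineer": engineer_node,
    "validate": validate_node,
    "reviewer": reviewer_node,
}


def next_node(current: str, state: State) -> Optional[str]:
    """Conditional edge after validate; fixed edges everywhere else."""
    if current == "validate":
        failed = bool(state["validation_errors"])
        return "engineer" if failed and state["retry_count"] < MAX_RETRIES else "reviewer"
    return {"scholar": "engineer", "engineer": "validate", "reviewer": None}[current]


def run_pipeline() -> Tuple[List[str], State]:
    state: State = {"retry_count": 0, "validation_errors": []}
    trace: List[str] = []
    current: Optional[str] = "scholar"
    while current is not None:
        trace.append(current)
        state = NODES[current](state)
        current = next_node(current, state)
    return trace, state


trace, final_state = run_pipeline()
print(" -> ".join(trace))
# scholar -> engineer -> validate -> engineer -> validate -> reviewer
```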

Autonomous PDF-to-Workflow Pipeline

Upload a scientific paper and VeriFlow autonomously extracts the methodology, generates executable workflows, validates them through a self-healing loop, and reviews them for scientific correctness — no manual intervention required.

ISA-JSON Study Design Extraction

The Scholar Agent extracts structured investigation hierarchies following the ISA (Investigation-Study-Assay) standard, with per-field confidence scores and source page references using Gemini 3's native PDF upload.

CWL v1.3 Workflow Generation

The Engineer Agent produces standards-compliant Common Workflow Language workflows with:

  • Step-by-step CommandLineTool definitions with InitialWorkDirRequirement embedded scripts
  • Auto-generated Dockerfiles for each tool
  • Data format adapters between incompatible step types
  • Repository context analysis (reads repo files up to 50KB for informed generation)
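
The repository context step in the last bullet might look roughly like the following. It is a sketch under stated assumptions: the 50KB limit is treated as a total budget across all files (the text does not say whether it is per file or overall), and the suffix allow-list, the per-file header line, and the name collect_repo_context are hypothetical.

```python
from pathlib import Path
from typing import List

MAX_CONTEXT_BYTES = 50 * 1024  # the 50KB budget, assumed to be a total
# Hypothetical allow-list of text-like files worth showing to the model.
TEXT_SUFFIXES = {".py", ".md", ".txt", ".yaml", ".yml", ".json", ".cwl"}


def collect_repo_context(repo_root: str, max_bytes: int = MAX_CONTEXT_BYTES) -> str:
    """Concatenate text files under repo_root until the byte budget is used up."""
    root = Path(repo_root)
    chunks: List[bytes] = []
    used = 0
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        if path.suffix.lower() not in TEXT_SUFFIXES:
            continue
        header = f"# --- {path.relative_to(root).as_posix()} ---\n".encode("utf-8")
        remaining = max_bytes - used - len(header)
        if remaining <= 0:
            break
        # Truncate the file if it would overflow the remaining budget.
        piece = header + path.read_bytes()[:remaining]
        chunks.append(piece)
        used += len(piece)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Sorting the paths keeps the resulting context deterministic between runs, which makes prompts easier to compare when debugging a retry.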

Self-Healing Validation Loop

The Validate node checks generated artifacts and the LangGraph conditional edges route:

  • Back to Engineer (retry with error context) if validation fails and retry_count < 3
  • Forward to Reviewer (final critique) if validation passes or max retries reached
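
A minimal sketch of the two structural checks and the routing rule listed above follows. The function names, the error messages, and the line-based detection of cwlVersion (which only covers YAML-style CWL, not JSON) are assumptions; the project's validate_node may check more or differently.

```python
from typing import List

MAX_RETRIES = 3


def validate_artifacts(dockerfile: str, cwl_text: str) -> List[str]:
    """Return a list of structural errors; an empty list means the checks passed."""
    errors: List[str] = []

    # Dockerfile instructions are case-insensitive; skip blanks and comments.
    instructions = [
        line.strip().split(None, 1)[0].upper()
        for line in dockerfile.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    if "FROM" not in instructions:
        errors.append("Dockerfile has no FROM instruction")

    # YAML-style CWL declares cwlVersion as a top-level key.
    if not any(line.startswith("cwlVersion:") for line in cwl_text.splitlines()):
        errors.append("CWL document has no cwlVersion declaration")

    return errors


def route_after_validate(errors: List[str], retry_count: int) -> str:
    """Pick the next node: retry the Engineer or hand over to the Reviewer."""
    if errors and retry_count < MAX_RETRIES:
        return "engineer"
    return "reviewer"


problems = validate_artifacts("RUN echo hi", "class: Workflow")
print(problems, "->", route_after_validate(problems, retry_count=0))
```

Returning the error messages as a list, instead of a boolean, is what lets the retry carry error context back into the Engineer's prompt.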

Plan & Apply — Interactive Agent Consultation

Users can chat with any agent about their output, formulate specific directives, and restart the workflow from any node with those directives applied:

  • POST /api/v1/chat/{run_id}/{agent_name} — Discuss agent output
  • POST /api/v1/chat/{run_id}/{agent_name}/apply — Apply directive and restart

Real-time WebSocket Streaming

All agent output is streamed token-by-token via WebSocket to the frontend console using Gemini 3's generate_content_stream() API, with the SmartMessageRenderer component providing intelligent rendering of JSON, Markdown, Dockerfiles, and CWL code blocks.
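
The forwarding pattern can be sketched with asyncio alone. Here fake_token_stream stands in for the model's streaming response and the send callback stands in for a WebSocket send; both names, and the word-level chunking, are hypothetical and chosen only so the example runs without a network connection.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List


async def fake_token_stream(text: str) -> AsyncIterator[str]:
    """Stand-in for a model stream: yields the text one word at a time."""
    for token in text.split(" "):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield token + " "


async def stream_to_client(
    stream: AsyncIterator[str],
    send: Callable[[str], Awaitable[None]],
) -> str:
    """Forward each chunk as soon as it arrives and return the full text."""
    parts: List[str] = []
    async for chunk in stream:
        await send(chunk)
        parts.append(chunk)
    return "".join(parts)


async def main() -> None:
    received: List[str] = []

    async def send(chunk: str) -> None:
        # Hypothetical sink; a real handler would write to the WebSocket here.
        received.append(chunk)

    full = await stream_to_client(fake_token_stream("cwlVersion found in workflow"), send)
    print(len(received), repr(full))


if __name__ == "__main__":
    asyncio.run(main())
```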

SPARC SDS-Compliant Export

Export results as a standards-compliant ZIP containing:

  • dataset_description.json — Dataset metadata
  • manifest.xlsx — File manifest with checksums
  • provenance.json — W3C PROV derivation tracking
  • derivative/ — Output files organized by execution step
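
As a rough sketch of how such an archive could be assembled, the following builds an in-memory ZIP with the two JSON files and a derivative/ tree. The metadata fields and the provenance layout are hypothetical placeholders and are not the SPARC SDS or W3C PROV schemas; manifest.xlsx is omitted because writing it would need a third-party spreadsheet library.

```python
import hashlib
import io
import json
import zipfile
from typing import Dict


def build_export_zip(outputs: Dict[str, bytes], dataset_name: str) -> bytes:
    """Pack step outputs plus JSON metadata into an in-memory ZIP archive."""
    # SHA-256 checksum per output file, keyed by its path inside the archive.
    checksums = {
        f"derivative/{name}": hashlib.sha256(data).hexdigest()
        for name, data in outputs.items()
    }
    description = {"name": dataset_name}  # hypothetical minimal metadata
    provenance = {"entities": checksums}  # hypothetical layout, not W3C PROV

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("dataset_description.json", json.dumps(description, indent=2))
        archive.writestr("provenance.json", json.dumps(provenance, indent=2))
        for name, data in outputs.items():
            archive.writestr(f"derivative/{name}", data)
    return buffer.getvalue()


zip_bytes = build_export_zip({"step1/out.txt": b"hello"}, "demo-dataset")
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    print(sorted(archive.namelist()))
```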

Interactive 4-Panel UI

Vue 3 frontend with:

  • Left: PDF upload + ISA hierarchy viewer with confidence scores
  • Center: Interactive Vue Flow workflow graph with custom nodes
  • Right: Results visualization and SDS export
  • Bottom: Real-time console with agent streaming via SmartMessageRenderer

Data Flow Pipeline

Scientific Publication (PDF) + User Context + Repository Path
        |
        v
  [POST /api/v1/orchestrate]
        |
        v
  VeriFlowService.run_workflow()
        |
        v
  +=== LangGraph StateGraph ================================+
  |                                                          |
  |  ScholarAgent (Gemini 3 Pro)                             |
  |  - Native PDF Upload (Part.from_bytes)                   |
  |  - Grounding with Google Search                          |
  |  - Pydantic Structured Output (AnalysisResult)           |
  |  - Thinking: HIGH (24,576) + Async Streaming             |
  |        |                                                 |
  |        v                                                 |
  |  ISA-JSON Hierarchy + Confidence Scores                  |
  |        |                                                 |
  |        v                                                 |
  |  EngineerAgent (Gemini 3 Pro)                            |
  |  - Pydantic Structured Output (WorkflowResult)           |
  |  - Repository Context (up to 50KB of source files)       |
  |  - Previous validation_errors injected into prompt       |
  |  - Thinking: HIGH (24,576) + Async Streaming             |
  |        |                                                 |
  |        v                                                 |
  |  CWL Workflow + Dockerfiles + Infrastructure Code        |
  |        |                                                 |
  |        v                                                 |
  |  Validate Node (System)                                  |
  |  - Dockerfile has FROM instruction?                      |
  |  - CWL has cwlVersion declaration?                       |
  |        |                                                 |
  |    Valid? --No + retry<3--> Back to Engineer              |
  |        |                                                 |
  |       Yes (or max retries)                               |
  |        |                                                 |
  |        v                                                 |
  |  ReviewerAgent (Gemini 3 Flash)                          |
  |  - ISA vs Generated Code critique                        |
  |  - Thought Signature Preservation                        |
  |  - Thinking: MEDIUM (8,192) + Async Streaming            |
  |  - Decision: approved / rejected                         |
  |                                                          |
  +=========================================================+
        |
        v (WebSocket streaming throughout)
  Vue 3 Frontend — Real-time Console + ISA Viewer + Graph
        |
        v (optional)
  ExecutionEngine --> CWLParser --> DAGGenerator
        |
        v
  Airflow 3.0.6 --> DockerOperator --> cwltool
        |
        v
  SDS ZIP Export (dataset_description + manifest + provenance)

Tech Stack

| Layer | Technology |
|-------|------------|
| AI | Gemini 3 (google-genai SDK): gemini-3-pro-preview, gemini-3-flash-preview |
| Orchestration | LangGraph (StateGraph with conditional edges, self-healing retry loop) |
| Backend | Python 3.11, FastAPI, Pydantic, uvicorn, json-repair |
| Frontend | Vue 3.5, Vue Flow 1.41, Pinia, Tailwind CSS 4, TypeScript, Vite 6, markdown-it |
| Real-time | WebSocket (FastAPI native), SmartMessageRenderer |
| Execution | Apache Airflow 3.0.6 (LocalExecutor), CWL v1.3, Docker-in-Docker, cwltool |
| Storage | SQLite (sessions), PostgreSQL 15 (Airflow), MinIO (S3-compatible, 4 buckets) |
| Standards | ISA-JSON, SPARC SDS, CWL v1.3, W3C PROV |

Quick Start

Prerequisites

Setup

# 1. Clone the repository
git clone https://github.com/ABI-CTT-Group/VeriFlow.git
cd VeriFlow

# 2. Configure environment
cp .env.example .env
# Edit .env and add your GEMINI_API_KEY

# 3. Start all services
docker compose up -d

# 4. Open the app
# Frontend:        http://localhost:3000
# Backend API:     http://localhost:8000/docs
# Airflow UI:      http://localhost:8080
# MinIO Console:   http://localhost:9001

Development (without Docker)

# Backend (terminal 1, starting from the repository root)
cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000

# Frontend (terminal 2, starting from the repository root)
cd frontend
npm install
npm run dev
Project Structure

VeriFlow/
+-- backend/                     # Python FastAPI backend
|   +-- app/
|   |   +-- agents/              # ScholarAgent, EngineerAgent, ReviewerAgent (class-based)
|   |   +-- api/                 # 5 REST API routers + WebSocket endpoint
|   |   +-- graph/               # LangGraph StateGraph + node implementations
|   |   |   +-- workflow.py      # StateGraph definition with conditional edges
|   |   |   +-- nodes.py         # scholar_node, engineer_node, validate_node, reviewer_node
|   |   +-- models/              # Pydantic schemas (Gemini structured output)
|   |   +-- services/            # GeminiClient, VeriFlowService, WebSocketManager, SQLiteDB
|   |   +-- state.py             # AgentState TypedDict (LangGraph shared state)
|   |   +-- main.py              # FastAPI entry point
|   +-- config.yaml              # Agent model & thinking level configuration
|   +-- prompts.yaml             # Versioned prompt templates per agent
|   +-- examples/                # Pre-computed agent outputs for MAMA-MIA demo
|   +-- tests/                   # pytest tests (unit + integration)
+-- frontend/                    # Vue 3 + TypeScript + Tailwind CSS 4
|   +-- src/
|   |   +-- components/          # Vue components including SmartMessageRenderer
|   |   +-- stores/              # Pinia workflow + console stores
|   |   +-- services/            # API client (axios) + WebSocket service
|   |   +-- utils/               # dagre layout utilities
+-- airflow/                     # Custom Airflow 3.0.6 image + DAGs
+-- cwl/                         # CWL runner service (cwltool)
+-- sandbox/                     # Sandbox Docker environment (PyTorch + nnU-Net)
+-- docs/                        # Architecture diagrams (Mermaid, draw.io, about, testing)
+-- docker-compose.yml           # 10-service orchestration (development)
+-- docker-compose.prod.yml      # Production configuration (GHCR images)
+-- .env.example                 # Environment variable template
+-- SPEC.md                      # Technical specification

Testing

# Run each command below from the repository root.

# Backend unit tests
cd backend && python -m pytest tests/ -v

# Backend tests in Docker
docker compose run --rm backend pytest tests/ -v

# Frontend tests (Vitest)
cd frontend && npx vitest run

Documentation

License

VeriFlow is fully open source and distributed under the Apache License 2.0. See LICENSE for more information.

Team

Acknowledgements
