VeriFlow

Autonomous Research Reliability Engineer — Converts scientific publications into verifiable, executable computational workflows using Gemini 3 AI agents.

Built for the Gemini 3 Hackathon.

About

VeriFlow is an end-to-end platform that tackles the research reproducibility crisis by autonomously converting scientific publications into executable computational workflows. Given a PDF of a research paper, VeriFlow uses a LangGraph-orchestrated pipeline of three specialized Gemini 3 AI agents to extract the methodology, generate standards-compliant CWL v1.3 workflows with Docker containers, validate them through a self-healing retry loop, and review the output for scientific correctness — all with real-time WebSocket streaming to an interactive Vue 3 frontend.

Note: This project was built for the Gemini 3 Hackathon.

The Problem

Scientific research faces a reproducibility crisis: studies report that 70%+ of researchers have failed to reproduce another scientist's experiment. Key barriers include:

Implicit methodology — Critical experimental details are buried in PDF publications as unstructured text
Missing computational environments — Papers describe tools and models without specifying exact versions, dependencies, or configurations
No executable artifacts — Methodologies exist only as prose, not as runnable code
Manual workflow creation — Converting a paper's methodology into an executable pipeline requires deep domain expertise and significant engineering effort

Our Solution - VeriFlow

VeriFlow bridges the gap between scientific publications and executable workflows through an autonomous, AI-driven pipeline:

Upload a scientific publication (PDF) with optional user context
Scholar Agent (Gemini 3 Pro) extracts the full methodology as a structured ISA-JSON hierarchy using native PDF upload, grounding with Google Search, and agentic vision for diagram analysis
Engineer Agent (Gemini 3 Pro) generates a complete CWL v1.3 workflow with Dockerfiles, tool definitions, and infrastructure code using the extracted ISA-JSON and repository context
Validate Node checks generated artifacts for structural correctness (Dockerfile has FROM, CWL has cwlVersion) — if errors are found, the self-healing loop retries the Engineer up to 3 times with error context
Reviewer Agent (Gemini 3 Flash) critiques the final output for scientific correctness, comparing the ISA extraction against the generated code
Plan & Apply — Users can chat with any agent, refine directives, and restart the workflow from any node
Export results as a SPARC SDS-compliant ZIP with full provenance tracking

Gemini 3 Integration

VeriFlow leverages 4 Gemini 3 features through the google-genai SDK (from google import genai):

#	Feature	How It's Used	Agent(s)
1	Pydantic Structured Output	All agents use Pydantic `BaseModel` subclasses as `response_schema` parameter with `response_mime_type="application/json"` for type-safe, validated JSON responses (`AnalysisResult`, `WorkflowResult`, `ValidationResult`, `ErrorTranslationResult`)	All 3
2	Native PDF Upload	`types.Part.from_bytes(data=file_data, mime_type="application/pdf")` for multimodal publication ingestion — the entire PDF is sent to Gemini for full-document analysis	Scholar
3	Thought Signature Preservation	`_extract_thoughts()` captures reasoning chains from `response.candidates[].content.parts` where `part.thought == True`, preserving reasoning across multi-turn conversations for iterative CWL generation and validation-fix loops	Engineer, Reviewer
4	Async Streaming	`client.aio.models.generate_content_stream()` for real-time token-by-token streaming via WebSocket to the frontend console	All 3

Agent Architecture

Agent	Model	Thinking Budget	Responsibilities
ScholarAgent	`gemini-3-pro-preview`	HIGH (24,576)	PDF analysis, ISA-JSON extraction, confidence scoring, tool/model identification
EngineerAgent	`gemini-3-pro-preview`	HIGH (24,576)	CWL v1.3 workflow generation, Dockerfile creation, infrastructure code
ReviewerAgent	`gemini-3-flash-preview`	MEDIUM (8,192)	ISA vs code critique, scientific correctness validation, approval/rejection decision

GeminiClient — Central SDK Wrapper

All Gemini 3 interactions go through a single GeminiClient class:

from google import genai
from google.genai import types

class GeminiClient:
    model_name = "gemini-3.0-flash"  # Default fallback

    async def analyze_file(self, file_path, prompt, model, stream_callback):
        """Native PDF upload via Part.from_bytes + JSON response + async streaming"""

    async def generate_content(self, prompt, model, response_schema, stream_callback):
        """Text-only structured generation with optional streaming"""

    def _extract_thoughts(self, response) -> List[str]:
        """Chain-of-thought extraction from response candidates"""

    def _robust_parse_json(self, text) -> Dict:
        """json_repair-based parsing for Markdown backticks and malformed JSON"""

Architecture

Docker Compose — 10 Services

Service	Port	Purpose
`backend`	8000	FastAPI backend (Python 3.11)
`frontend`	3000	Vue 3 SPA via Nginx
`postgres`	5432	PostgreSQL 15 (Airflow database)
`minio`	9000/9001	S3-compatible object storage (4 buckets)
`minio-init`	—	Ephemeral bucket initialization
`airflow-apiserver`	8080	Airflow 3.0.6 REST API server
`airflow-scheduler`	—	Airflow task scheduler (LocalExecutor)
`dind`	—	Docker-in-Docker for CWL execution
`cwl`	—	CWL runner (cwltool)
`veriflow-sandbox`	—	Sandbox for script execution (PyTorch + nnU-Net)

Key Features

LangGraph-Orchestrated Multi-Agent Pipeline

A StateGraph with 4 nodes (Scholar, Engineer, Validate, Reviewer) orchestrates the full PDF-to-workflow pipeline. Conditional edges enable a self-healing retry loop where validation failures automatically route back to the Engineer with error context, up to 3 iterations.

Autonomous PDF-to-Workflow Pipeline

Upload a scientific paper and VeriFlow autonomously extracts the methodology, generates executable workflows, validates them through a self-healing loop, and reviews them for scientific correctness — no manual intervention required.

ISA-JSON Study Design Extraction

The Scholar Agent extracts structured investigation hierarchies following the ISA (Investigation-Study-Assay) standard, with per-field confidence scores and source page references using Gemini 3's native PDF upload.

CWL v1.3 Workflow Generation

The Engineer Agent produces standards-compliant Common Workflow Language workflows with:

Step-by-step CommandLineTool definitions with InitialWorkDirRequirement embedded scripts
Auto-generated Dockerfiles for each tool
Data format adapters between incompatible step types
Repository context analysis (reads repo files up to 50KB for informed generation)

Self-Healing Validation Loop

The Validate node checks generated artifacts and the LangGraph conditional edges route:

Back to Engineer (retry with error context) if validation fails and retry_count < 3
Forward to Reviewer (final critique) if validation passes or max retries reached

Plan & Apply — Interactive Agent Consultation

Users can chat with any agent about their output, formulate specific directives, and restart the workflow from any node with those directives applied:

POST /api/v1/chat/{run_id}/{agent_name} — Discuss agent output
POST /api/v1/chat/{run_id}/{agent_name}/apply — Apply directive and restart

Real-time WebSocket Streaming

All agent output is streamed token-by-token via WebSocket to the frontend console using Gemini 3's generate_content_stream() API, with the SmartMessageRenderer component providing intelligent rendering of JSON, Markdown, Dockerfiles, and CWL code blocks.

SPARC SDS-Compliant Export

Export results as a standards-compliant ZIP containing:

dataset_description.json — Dataset metadata
manifest.xlsx — File manifest with checksums
provenance.json — W3C PROV derivation tracking
derivative/ — Output files organized by execution step

Interactive 4-Panel UI

Vue 3 frontend with:

Left: PDF upload + ISA hierarchy viewer with confidence scores
Center: Interactive Vue Flow workflow graph with custom nodes
Right: Results visualization and SDS export
Bottom: Real-time console with agent streaming via SmartMessageRenderer

Data Flow Pipeline

Scientific Publication (PDF) + User Context + Repository Path
        |
        v
  [POST /api/v1/orchestrate]
        |
        v
  VeriFlowService.run_workflow()
        |
        v
  +=== LangGraph StateGraph ================================+
  |                                                          |
  |  ScholarAgent (Gemini 3 Pro)                             |
  |  - Native PDF Upload (Part.from_bytes)                   |
  |  - Grounding with Google Search                          |
  |  - Pydantic Structured Output (AnalysisResult)           |
  |  - Thinking: HIGH (24,576) + Async Streaming             |
  |        |                                                 |
  |        v                                                 |
  |  ISA-JSON Hierarchy + Confidence Scores                  |
  |        |                                                 |
  |        v                                                 |
  |  EngineerAgent (Gemini 3 Pro)                            |
  |  - Pydantic Structured Output (WorkflowResult)           |
  |  - Repository Context (up to 50KB of source files)       |
  |  - Previous validation_errors injected into prompt       |
  |  - Thinking: HIGH (24,576) + Async Streaming             |
  |        |                                                 |
  |        v                                                 |
  |  CWL Workflow + Dockerfiles + Infrastructure Code        |
  |        |                                                 |
  |        v                                                 |
  |  Validate Node (System)                                  |
  |  - Dockerfile has FROM instruction?                      |
  |  - CWL has cwlVersion declaration?                       |
  |        |                                                 |
  |    Valid? --No + retry<3--> Back to Engineer              |
  |        |                                                 |
  |       Yes (or max retries)                               |
  |        |                                                 |
  |        v                                                 |
  |  ReviewerAgent (Gemini 3 Flash)                          |
  |  - ISA vs Generated Code critique                        |
  |  - Thought Signature Preservation                        |
  |  - Thinking: MEDIUM (8,192) + Async Streaming            |
  |  - Decision: approved / rejected                         |
  |                                                          |
  +=========================================================+
        |
        v (WebSocket streaming throughout)
  Vue 3 Frontend — Real-time Console + ISA Viewer + Graph
        |
        v (optional)
  ExecutionEngine --> CWLParser --> DAGGenerator
        |
        v
  Airflow 3.0.6 --> DockerOperator --> cwltool
        |
        v
  SDS ZIP Export (dataset_description + manifest + provenance)

Tech Stack

Layer	Technology
AI	Gemini 3 (`google-genai` SDK) — `gemini-3-pro-preview`, `gemini-3-flash-preview`
Orchestration	LangGraph (`StateGraph` with conditional edges, self-healing retry loop)
Backend	Python 3.11, FastAPI, Pydantic, uvicorn, json-repair
Frontend	Vue 3.5, Vue Flow 1.41, Pinia, Tailwind CSS 4, TypeScript, Vite 6, markdown-it
Real-time	WebSocket (FastAPI native), SmartMessageRenderer
Execution	Apache Airflow 3.0.6 (LocalExecutor), CWL v1.3, Docker-in-Docker, cwltool
Storage	SQLite (sessions), PostgreSQL 15 (Airflow), MinIO (S3-compatible, 4 buckets)
Standards	ISA-JSON, SPARC SDS, CWL v1.3, W3C PROV

Quick Start

Prerequisites

Docker & Docker Compose
A Gemini API key from Google AI Studio

Setup

# 1. Clone the repository
git clone https://github.com/ABI-CTT-Group/VeriFlow.git
cd VeriFlow

# 2. Configure environment
cp .env.example .env
# Edit .env and add your GEMINI_API_KEY

# 3. Start all services
docker compose up -d

# 4. Open the app
# Frontend:        http://localhost:3000
# Backend API:     http://localhost:8000/docs
# Airflow UI:      http://localhost:8080
# MinIO Console:   http://localhost:9001

Development (without Docker)

# Backend
cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000

# Frontend
cd frontend
npm install
npm run dev

Project Structure

VeriFlow/
+-- backend/                     # Python FastAPI backend
|   +-- app/
|   |   +-- agents/              # ScholarAgent, EngineerAgent, ReviewerAgent (class-based)
|   |   +-- api/                 # 5 REST API routers + WebSocket endpoint
|   |   +-- graph/               # LangGraph StateGraph + node implementations
|   |   |   +-- workflow.py      # StateGraph definition with conditional edges
|   |   |   +-- nodes.py         # scholar_node, engineer_node, validate_node, reviewer_node
|   |   +-- models/              # Pydantic schemas (Gemini structured output)
|   |   +-- services/            # GeminiClient, VeriFlowService, WebSocketManager, SQLiteDB
|   |   +-- state.py             # AgentState TypedDict (LangGraph shared state)
|   |   +-- main.py              # FastAPI entry point
|   +-- config.yaml              # Agent model & thinking level configuration
|   +-- prompts.yaml             # Versioned prompt templates per agent
|   +-- examples/                # Pre-computed agent outputs for MAMA-MIA demo
|   +-- tests/                   # pytest tests (unit + integration)
+-- frontend/                    # Vue 3 + TypeScript + Tailwind CSS 4
|   +-- src/
|   |   +-- components/          # Vue components including SmartMessageRenderer
|   |   +-- stores/              # Pinia workflow + console stores
|   |   +-- services/            # API client (axios) + WebSocket service
|   |   +-- utils/               # dagre layout utilities
+-- airflow/                     # Custom Airflow 3.0.6 image + DAGs
+-- cwl/                         # CWL runner service (cwltool)
+-- sandbox/                     # Sandbox Docker environment (PyTorch + nnU-Net)
+-- docs/                        # Architecture diagrams (Mermaid, draw.io, about, testing)
+-- docker-compose.yml           # 10-service orchestration (development)
+-- docker-compose.prod.yml      # Production configuration (GHCR images)
+-- .env.example                 # Environment variable template
+-- SPEC.md                      # Technical specification

Testing

# Backend unit tests
cd backend && python -m pytest tests/ -v

# Backend tests in Docker
docker compose run --rm backend pytest tests/ -v

# Frontend tests (Vitest)
cd frontend && npx vitest run

Documentation

Technical Specification — Full implementation spec with API endpoints, data structures, and agent details

License

VeriFlow is fully open source and distributed under the Apache License 2.0. See LICENSE for more information.

Team

Acknowledgements

We would like to thank the organizers of the Gemini Hachathon for organising this event and all the support doucmnetation
ARDC Nectar Research Cloud for providing the infrastructure for hosting this project

Name		Name	Last commit message	Last commit date
Latest commit History 183 Commits
.claude		.claude
.github/workflows		.github/workflows
airflow		airflow
backend		backend
cwl		cwl
docs		docs
examples		examples
frontend		frontend
planning		planning
reports		reports
sandbox		sandbox
scripts		scripts
.env.example		.env.example
.gitignore		.gitignore
Create_Plan.md		Create_Plan.md
LICENSE		LICENSE
PLAN.md		PLAN.md
README.md		README.md
SPEC.md		SPEC.md
docker-compose.prod.yml		docker-compose.prod.yml
docker-compose.yml		docker-compose.yml
technical_implementation_analysis_report.txt		technical_implementation_analysis_report.txt

Folders and files

Latest commit

History

Repository files navigation

VeriFlow

Table of Contents

About

The Problem

Our Solution - VeriFlow

Gemini 3 Integration

Agent Architecture

GeminiClient — Central SDK Wrapper

Architecture

Docker Compose — 10 Services

Key Features

LangGraph-Orchestrated Multi-Agent Pipeline

Autonomous PDF-to-Workflow Pipeline

ISA-JSON Study Design Extraction

CWL v1.3 Workflow Generation

Self-Healing Validation Loop

Plan & Apply — Interactive Agent Consultation

Real-time WebSocket Streaming

SPARC SDS-Compliant Export

Interactive 4-Panel UI

Data Flow Pipeline

Tech Stack

Quick Start

Prerequisites

Setup

Development (without Docker)

Project Structure

Testing

Documentation

License

Team

Acknowledgements

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 26

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages