DEFENSE-SEU/RobustFlow

RobustFlow: Towards Robust Agentic Workflow Generation

Shengxiang Xu (徐圣翔)*, Jiayi Zhang (张佳钇)*, Shimin Di (邸世民),

Yuyu Luo (骆昱宇), Liang Yao (姚亮), Hanmo Liu (刘翰墨),

Jia Zhu (朱佳), Fan Liu (刘凡), Min-Ling Zhang (张敏灵)

* Equal Contribution

If you run into any difficulties using or reproducing the code, please contact me directly (email: xushx@seu.edu.cn, WeChat: 13270628738).

Introduction

Welcome to the official repository of our paper "RobustFlow: Towards Robust Agentic Workflow Generation"!

The automated generation of agentic workflows is a promising frontier for enabling large language models (LLMs) to solve complex tasks. However, our investigation reveals that the robustness of agentic workflow generation remains a critical, unaddressed challenge. Current methods often generate wildly inconsistent workflows when given instructions that are semantically identical but phrased differently. This brittleness severely undermines their reliability and trustworthiness in real-world applications.

To quantitatively diagnose this instability, we propose metrics based on nodal and topological similarity to evaluate workflow consistency against common semantic variations such as paraphrasing and noise injection.

[Figure: robustness evaluation metric]
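To make the idea of nodal and topological similarity concrete, here is a minimal sketch that compares two workflows by the overlap of their node sets and of their directed edge sets. It is only an illustration: the Jaccard overlap, the `node_weight` parameter, and the operator names (`Generate`, `Review`, and so on) are assumptions made for this example, not the metric defined in the paper.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def jaccard(a: Set, b: Set) -> float:
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def workflow_similarity(
    nodes_a: Iterable[str],
    edges_a: Iterable[Edge],
    nodes_b: Iterable[str],
    edges_b: Iterable[Edge],
    node_weight: float = 0.5,  # hypothetical weighting, not taken from the paper
) -> float:
    """Blend node-set overlap (nodal) with edge-set overlap (topological)."""
    node_sim = jaccard(set(nodes_a), set(nodes_b))
    edge_sim = jaccard(set(edges_a), set(edges_b))
    return node_weight * node_sim + (1.0 - node_weight) * edge_sim


# Hypothetical workflows generated from an original and a paraphrased instruction.
original = (
    ["Generate", "Review", "Revise", "Ensemble"],
    [("Generate", "Review"), ("Review", "Revise"), ("Revise", "Ensemble")],
)
paraphrased = (
    ["Generate", "Review", "Ensemble"],
    [("Generate", "Review"), ("Review", "Ensemble")],
)

score = workflow_similarity(*original, *paraphrased)
print(f"similarity: {score:.2f}")
```

In this toy case the two workflows share three of four nodes but only one of four distinct edges, so the topological term pulls the score down more than the nodal term does.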

We then propose RobustFlow, a novel training framework that leverages preference optimization to teach models invariance to instruction variations.

[Figure: method overview]
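For readers unfamiliar with preference optimization, the sketch below computes the loss of one widely used objective, Direct Preference Optimization (DPO), for a single chosen/rejected pair. This README does not state which objective RobustFlow uses or how preference pairs are built, so the choice of DPO, the function name `dpo_loss`, and the `beta` value are all assumptions made for illustration.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # illustrative value, not from the paper
) -> float:
    """DPO loss for one pair: -log(sigmoid(beta * (chosen gain - rejected gain)))."""
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical sequence log-probabilities for a consistent ("chosen") workflow
# and an inconsistent ("rejected") workflow under the policy and a reference model.
loss_before = dpo_loss(-12.0, -12.0, -12.0, -12.0)
loss_after = dpo_loss(-10.0, -14.0, -12.0, -12.0)
print(loss_before, loss_after)
```

The loss falls as the policy assigns relatively more probability to the chosen workflow than the reference model does, which is the general mechanism by which a preference objective can reward consistent outputs across instruction variants.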

By training on sets of synonymous task descriptions, RobustFlow raises workflow robustness scores to the 70%-90% range, a substantial improvement over existing approaches.

[Figure: experiments]

Quick Start

  1. Set up the Python environment:

    conda create -n robustflow python=3.9
    conda activate robustflow
    pip install -r requirements.txt
  2. Data Preparation

    You can download our prepared datasets or reproduce them locally.

    • Place the official original file in the dataset folder (example: noise_dataset/DROP/drop_original.jsonl).

    • Run the rewrite script in that folder:

      cd noise_dataset/DROP/
      python rewrite_drop.py

      This generates:

      • drop_paraphrasing.jsonl
      • drop_requirements.jsonl
      • drop_light_noise.jsonl
      • drop_moderate_noise.jsonl
      • drop_heavy_noise.jsonl
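After running the rewrite script, a quick check like the one below can confirm that all five variant files were produced and are non-empty. This is a hypothetical helper, not part of the repository; the `<prefix>_<variant>.jsonl` naming is inferred from the DROP file list above and may differ for other datasets.

```python
from pathlib import Path
from typing import List

# Variant suffixes taken from the DROP file list above.
EXPECTED_VARIANTS = [
    "paraphrasing",
    "requirements",
    "light_noise",
    "moderate_noise",
    "heavy_noise",
]


def missing_variants(folder: str, prefix: str) -> List[str]:
    """Return the expected variant file names that are absent from `folder`."""
    base = Path(folder)
    names = [f"{prefix}_{variant}.jsonl" for variant in EXPECTED_VARIANTS]
    return [name for name in names if not (base / name).is_file()]


def count_records(path: str) -> int:
    """Count non-blank lines, i.e. JSONL records, in a file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


if __name__ == "__main__":
    missing = missing_variants("noise_dataset/DROP", "drop")
    print("missing:", missing or "none")
```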

    If you want to analyze the dataset, you can refer to the examples under noise_dataset/Distribution/ and follow the steps below:

    cd noise_dataset/Distribution
    bash extract.sh

    This will generate dataset embeddings in the embedding/ directory. To analyze and visualize them, you can either write your own script or use the provided ones:

    python analyze.py
    python draw.py

    These scripts compute statistics and clustering results from the embeddings, and generate distribution visualizations in the visual/ directory.
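If you prefer to write your own analysis, one simple statistic is how far the centroid of a variant's embeddings drifts from the centroid of the original dataset. The sketch below works on plain lists of floats; it does not read the files in embedding/, whose format is not documented here, and the function names and toy vectors are illustrative assumptions rather than what analyze.py computes.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty collection of equal-length vectors."""
    count = len(vectors)
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / count for i in range(dim)]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid_shift(original: Sequence[Vector], variant: Sequence[Vector]) -> float:
    """1 - cosine similarity between the two centroids (0 means no drift)."""
    return 1.0 - cosine(centroid(original), centroid(variant))


# Toy 2-D embeddings standing in for real ones.
original_emb = [[1.0, 0.0], [0.9, 0.1]]
light_noise_emb = [[0.95, 0.05], [0.9, 0.1]]
heavy_noise_emb = [[0.2, 0.8], [0.1, 0.9]]

print("light:", centroid_shift(original_emb, light_noise_emb))
print("heavy:", centroid_shift(original_emb, heavy_noise_emb))
```

A small shift suggests the variant stays close to the original distribution on average; it says nothing about per-sample drift, for which pairwise similarities would be needed.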

  3. Baseline Evaluation

    Clone the official repositories of AFlow, ScoreFlow and Flow into AFlow/, Scoreflow/ and Flow/, and run each project strictly following its README to reproduce the baseline results as-is.

    • AFlow Evaluation

      cd evaluate/
      bash infer_aflow.sh
      python aflow_scripts/find.py
      python eval_aflow.py
      cat aflow_score.txt
    • ScoreFlow Evaluation

    • Flow Evaluation

  4. Additional case studies are available in samples/ for qualitative analysis.

Citation

If you use RobustFlow in your research, please cite our paper:

@article{xu2025robustflow,
  title={RobustFlow: Towards Robust Agentic Workflow Generation},
  author={Xu, Shengxiang and Zhang, Jiayi and Di, Shimin and Luo, Yuyu and Yao, Liang and Liu, Hanmo and Zhu, Jia and Liu, Fan and Zhang, Min-Ling},
  journal={arXiv preprint arXiv:2509.21834},
  year={2025}
}
