Multimodal Agentic AI Scientists

A curated list of papers on Agentic Multimodal Large Language Models (MLLMs) for Scientific Discovery

🚀 Join us in building the AI for Science community! Know a great paper we missed? Open an issue — together, let's accelerate scientific discovery with AI!

This repository accompanies our survey paper: "Exploring Agentic Multimodal Large Language Models: A Survey for AIScientists"

AIScientist GitHub Repository Overview

What is an AIScientist?

AIScientists are autonomous agents powered by multimodal large language models (MLLMs) that can understand papers, generate hypotheses, plan and conduct experiments, analyze results, and draft manuscripts across the entire scientific research lifecycle (Lu et al., 2024; Boiko et al., 2023; Gottweis et al., 2025). But how do we build one? This survey summarizes a complete pipeline for developing multimodal agentic AIScientists, with representative studies spanning 10 scientific domains.

Comparison with Related Surveys

| Paper | Taxonomy | Ag. | DM. | Method | HCI | Ben. | #Dom. |
|---|---|---|---|---|---|---|---|
| Zhang et al. (2024) | Domain | | Seq.+ | Train. only | | | 6 |
| Gridach et al. (2025) | Sci. Workflow | | | Infer. only | | | 4 |
| Luo et al. (2025) | Sci. Workflow | | | | | | |
| Zhang et al. (2025) | Sci. Workflow | | Seq.+ | | | | |
| Ren et al. (2025) | Agent Composition | | | Train. & Infer. | | | 6+ |
| Wei et al. (2025) | Auto. & Domain | | | Infer. only | | | 4 |
| Hu et al. (2025) | Data & Domain | | | | | | 6+ |
| Ours | ML Pipeline | | | Train. & Infer. | | | 10 |

Ag. = Agentic AI; DM. = Data Modality; HCI = Human-Computer Interaction; Ben. = Benchmark; #Dom. = Number of domains; Seq.+ = Sequence and more modalities; Train. = Agent Training; Infer. = Agent Inference

Ours: An End-to-End Developer Pipeline

Overview of the agentic MLLM framework for scientific discovery

Overview of our framework: Starting from diverse Input & Output modalities, through Agent Training and Inference methods, to Evaluation benchmarks, with Human-AI Collaboration integrated at every stage.


Table of Contents


⚙️ Methods for Scientific MLLM Agents

🏋️ Agent Training

Supervised Fine-Tuning (SFT)

  • In-Context Learning with Long-Context Models: An In-Depth Exploration (2025) - Bertsch et al.
  • RLSF: Fine-tuning LLMs via Symbolic Feedback (2024) - Jha et al.
  • Efficient Fine-Tuning of Single-Cell Foundation Models Enables Zero-Shot Molecular Perturbation Prediction (2025) - Maleki et al.
  • Training language models to follow instructions with human feedback (2022) - Ouyang et al.
  • InstructProtein: Aligning Human and Protein Language via Knowledge Instruction (2023) - Wang et al.
  • MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine (2024) - Zhang et al.

Reinforcement Learning (RL)

Papers on reward-based agent training, including RLHF (reinforcement learning from human feedback) and DPO (direct preference optimization).
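To make the DPO objective concrete, here is a minimal sketch of its loss for a single preference pair. The function name and scalar inputs are illustrative assumptions (a real implementation would operate on batched token log-probabilities from a policy and a frozen reference model):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are total log-probabilities of the chosen/rejected responses
    under the policy being trained and under a frozen reference model.
    """
    # Implicit reward = beta * (policy log-prob - reference log-prob)
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Negative log-sigmoid of the reward margin
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the policy prefers the chosen response more
# strongly than the reference model does.
loss = dpo_loss(-10.0, -12.0, -11.0, -11.0)
```

The appeal of DPO over full RLHF is that this supervised-looking loss needs no separately trained reward model or on-policy sampling.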

Contrastive & Adversarial Learning

  • Semi-supervised learning-based virtual adversarial training on graph for molecular property prediction (2025) - Lu et al.
  • Triplet Contrastive Learning Framework With Adversarial Hard-Negative Sample Generation for Multimodal Remote Sensing Images (2024) - Chen et al.
  • Generating mutants of monotone affinity towards stronger protein complexes through adversarial learning (2024) - Lan et al.
  • Recent advances in generative adversarial networks for gene expression data: a comprehensive review (2023) - Lee, Minhyeok
  • Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning (2021) - Li et al.
  • Drug repositioning based on residual attention network and free multiscale adversarial training (2024) - Li et al.
  • DINOv2: Learning Robust Visual Features without Supervision (2023) - Oquab et al.
  • Robust image representations with counterfactual contrastive learning (2025) - Roschewitz et al.
  • Improved Techniques for Training GANs (2016) - Salimans et al.
  • SupReMix: Supervised Contrastive Learning for Medical Imaging Regression with Mixup (2025) - Wu et al.
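Most of the contrastive methods above build on an InfoNCE-style objective: pull an anchor embedding toward its positive and push it away from negatives. A minimal pure-Python sketch for a single anchor (names and toy vectors are illustrative, not from any paper above):

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE contrastive loss for one anchor embedding.

    The anchor should score high against its positive and low against
    the negatives; all inputs are plain lists of floats.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    # Temperature-scaled similarities, positive first
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Cross-entropy with the positive at index 0
    max_l = max(logits)  # subtract max for numerical stability
    denom = sum(math.exp(l - max_l) for l in logits)
    return -(logits[0] - max_l - math.log(denom))

# Anchor nearly aligned with its positive, far from both negatives
loss = info_nce([1.0, 0.0], [0.9, 0.1], [[-1.0, 0.2], [0.0, 1.0]])
```

In practice the "negatives" are the other samples in the batch, and the adversarial variants listed above additionally synthesize hard negatives near the decision boundary.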

🚀 Agent Inference

Retrieval-Augmented Generation (RAG)

In-Context Learning (ICL)

  • Language Models are Few-Shot Learners (2020) - Brown et al.
  • A Survey on In-context Learning (2024) - Dong et al.
  • Discovering New Theorems via LLMs with In-Context Proof Learning in Lean (2025) - Kasaura et al.
  • What Makes Good In-Context Examples for GPT-3? (2021) - Liu et al.
  • What Makes In-context Learning Effective for Mathematical Reasoning: A Theoretical Analysis (2024) - Liu et al.
  • ProtTeX: Structure-in-Context Reasoning and Editing of Proteins with Large Language Models (2025) - Ma et al.
  • Chain-of-thought prompting elicits reasoning in large language models (2022) - Wei et al.
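The mechanics of few-shot in-context learning are simple: concatenate worked demonstrations ahead of the query so the model's completion supplies the answer. A minimal sketch (the Q/A template and the unit-conversion demonstrations are hypothetical):

```python
def build_icl_prompt(examples, query):
    """Assemble a few-shot prompt: worked demonstrations followed by
    the unanswered query, so the model completes the final answer."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)

# Hypothetical demonstrations for a unit-conversion task
demos = [
    ("Convert 2 km to meters.", "2000 m"),
    ("Convert 5 kg to grams.", "5000 g"),
]
prompt = build_icl_prompt(demos, "Convert 3 L to milliliters.")
```

Which demonstrations to put in `examples` is exactly the selection problem studied by Liu et al. (2021) above; chain-of-thought prompting extends the same template by including intermediate reasoning steps in each answer.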

Planning & Tool Use

🤝 Multi-Agent Systems


📈 Benchmarks & Evaluation


🧑‍🔬 Human-AI Collaboration


License

This project is licensed under the MIT License - see the LICENSE file for details.
