[EMNLP 2025 Findings] Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization


📃 Paper | 🤗 Dataset | 💻 Code

📖 Overview

Materials characterization plays a key role in understanding the processing–microstructure–property relationships that guide material design and optimization. While multimodal large language models (MLLMs) have shown promise in generative and predictive tasks, their ability to interpret real-world characterization imaging data remains underexplored.

MatCha is the first benchmark designed specifically for materials characterization image understanding. It provides a comprehensive evaluation framework that reflects real challenges faced by materials scientists.

✨ Key Features

  • 1,500 expert-level questions focused on materials characterization.

  • Covers 4 stages of materials research across 21 distinct tasks.

  • Tasks designed to mimic real-world scientific challenges.

  • Provides the first systematic evaluation of MLLMs on materials characterization.

📁 Repository Structure

```
MatCha/
├── MatCha_Data/data/                     # downloaded from Hugging Face (see below)
└── src
    ├── lf_model_cfg/
    ├── eval.py
    ├── models.py
    ├── score.py
    └── utils.py
```

🗃️ Dataset Access

The dataset is available on 🤗 MatCha and is distributed under the accompanying license.

🚀 Quick Start

Follow the steps below to get started with the evaluation.

1. Clone the Repository

```bash
git clone https://github.com/FreedomIntelligence/MatCha
cd MatCha
```

2. Download the Dataset

```bash
huggingface-cli download \
    --repo-type dataset \
    --resume-download \
    FreedomIntelligence/MatCha \
    --local-dir MatCha_Data
```

This downloads the complete dataset (question files and images) into MatCha_Data.
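Once downloaded, the question records can be inspected directly. A minimal sketch, assuming the questions are stored as JSON Lines with fields such as `question`, `options`, and `answer` (these field names are illustrative, not confirmed by the repository):

```python
import json
from pathlib import Path

def load_questions(path):
    """Read MatCha-style question records from a JSON Lines file."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

For example, `load_questions("MatCha_Data/data/questions.jsonl")` would return a list of dicts, one per benchmark question (the filename here is hypothetical; check the downloaded `MatCha_Data/data/` directory for the actual layout).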

3. Run Evaluation

```bash
cd ./src/

python eval.py \
    --model gpt-4o \
    --method zero-shot

python score.py \
    --output_path path/to/output/file
```
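score.py aggregates the saved model outputs into a final score. The core of such a scorer, accuracy of predicted versus gold answers, can be sketched as follows (a simplification assuming single-letter multiple-choice answers; the repository's actual metric may differ):

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match the gold answers.

    Comparison is case-insensitive and ignores surrounding whitespace,
    so "a" and " A " both count as a match for gold answer "A".
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must align")
    if not references:
        return 0.0
    correct = sum(
        p.strip().upper() == r.strip().upper()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)
```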

📄 Citation

If you find our work helpful, please cite:

```bibtex
@misc{lai2025matcha,
      title={Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization}, 
      author={Zhengzhao Lai and Youbin Zheng and Zhenyang Cai and Haonan Lyu and Jinpu Yang and Hongqing Liang and Yan Hu and Benyou Wang},
      year={2025},
      eprint={2509.09307},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.09307}, 
}
```
