LoopViT: Scaling Visual ARC with Looped Transformers

License: MIT

This is the official implementation of LoopViT, a recursive vision transformer architecture designed to solve abstract reasoning tasks in the Abstraction and Reasoning Corpus (ARC).

[Paper] | [Project Page]

Wen-Jie Shu1,*, Xuerui Qiu2, Rui-Jie Zhu3, Harold Haodong Chen1, Yexin Liu1, Harry Yang1

1HKUST    2CASIA    3UC Santa Cruz
*Email: wenjieshu2003@gmail.com


🚀 Overview: Rethinking ARC as a Looped Process

Conventional Vision Transformers (ViTs) follow a feed-forward paradigm in which reasoning depth is strictly bound to parameter count. However, abstract reasoning of the kind ARC demands is rarely a single-pass perceptual decision; it more closely resembles iterative latent deliberation, in which an internal state is repeatedly refined.

LoopViT establishes a new paradigm for visual reasoning by decoupling computational depth from model capacity:

  • Looped Vision Transformer: We propose the first looped ViT architecture, establishing iterative recurrence as a powerful paradigm for abstract visual reasoning and demonstrating that pure visual representations suffice for ARC, without linguistic or symbolic priors.
  • Scaling Time over Space: Instead of relying solely on raw capacity ("Space"), LoopViT lets models adapt computational effort ("Time") via a weight-tied Hybrid Block (Convolutions + Global Attention). This design matches the local, cellular-update nature of ARC transformations.
  • Predictive Crystallization (Dynamic Exit): We introduce a parameter-free mechanism in which predictions "crystallize" (predictive entropy decays) over iterations. LoopViT halts early on easier tasks, significantly improving the accuracy-FLOPs Pareto frontier.
  • Empirical Superiority:
    • LoopViT (Small, 3.8M) achieves 60.1% on ARC-AGI-1, surpassing the 18M VARC baseline (54.5%) with roughly 1/5 of the parameters.
    • LoopViT (Large, 18M) reaches 65.8%, outperforming massive ensembles of feed-forward experts.
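The core recipe, a single weight-tied block applied for a configurable number of loops so that depth becomes a runtime knob rather than a parameter-count decision, can be sketched as follows. This is a minimal NumPy illustration with hypothetical names (`hybrid_block`, `looped_forward`), not the actual implementation; in the real model the block combines convolutions and global attention.

```python
import numpy as np

rng = np.random.default_rng(0)

# One set of weights, reused at every iteration: loop count (compute "Time")
# is decoupled from parameter count (model "Space").
W = rng.normal(scale=0.1, size=(16, 16))

def hybrid_block(state, tokens):
    # Stand-in for the conv + global-attention block: a simple residual
    # update of the latent state conditioned on the input tokens.
    return state + np.tanh((state + tokens) @ W)

def looped_forward(tokens, n_loops=6):
    # Iterative latent deliberation: the same block refines the state
    # n_loops times before the prediction is read out.
    state = np.zeros_like(tokens)
    for _ in range(n_loops):
        state = hybrid_block(state, tokens)
    return state

tokens = rng.normal(size=(4, 16))   # 4 tokens, 16-dim embeddings
out = looped_forward(tokens, n_loops=6)
```

Because the weights are shared across iterations, increasing `n_loops` adds computation without adding parameters.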

🛠️ Installation

  1. Clone the repository:

    git clone https://github.com/WenjieShu/LoopViT.git
    cd LoopViT
  2. Install dependencies:

    pip install -r requirements.txt

📖 Usage

Data Preparation

The model expects the ARC-AGI dataset. Please refer to the raw_data section in the VARC repository for detailed data processing instructions. By default, place the data in raw_data/ARC-AGI.

Training (Offline)

We provide a shell script to replicate our main experimental setup:

# Trains a 6-layer loop-core model (recurring 6 times)
bash script/offline_train_loop_VARC_ViT.sh

This script acts as a wrapper around offline_train_loop_ARC.py with the recommended hyperparameters.

Test-Time Training (TTT)

To reproduce the TTT results on ARC-1:

# Runs TTT on ARC-1 evaluation tasks
bash script/test_time_training_VARC_LoopViT_ARC1.sh

This will iterate over tasks defined in script/arc1_task_list.sh.

Early Exit TTT (Dynamic Compute)

To run TTT with dynamic early exit and visualize the loop steps:

bash script/test_time_training_VARC_LoopViT_ARC1_early_exit.sh

This script enables --exit-on-entropy-stable and saves visualizations of attention maps and reasoning steps.
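One plausible reading of the entropy-stability exit rule is sketched below; the flag name `--exit-on-entropy-stable` comes from the script above, but the threshold logic and function names here are illustrative assumptions, not the repository's actual implementation.

```python
import numpy as np

def mean_entropy(probs, eps=1e-12):
    # Shannon entropy of per-token categorical predictions, averaged
    # over tokens; decaying entropy = predictions "crystallizing".
    return float(-(probs * np.log(probs + eps)).sum(axis=-1).mean())

def exit_on_entropy_stable(entropies, tol=1e-3):
    # Hypothetical criterion: halt at the first loop step where the
    # change in predictive entropy falls below tol; otherwise run all loops.
    for step in range(1, len(entropies)):
        if abs(entropies[step] - entropies[step - 1]) < tol:
            return step
    return len(entropies) - 1

# Entropy trace from a "crystallizing" run: large early drops, then a plateau.
trace = [2.30, 1.10, 0.52, 0.31, 0.3095]
exit_step = exit_on_entropy_stable(trace)
```

On an easy task the plateau is reached early, so the loop halts before the maximum iteration count, saving FLOPs.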

See script/ for more examples of training and TTT scripts.


🏗️ Project Structure

LoopViT/
├── src/                        # Core model definitions
│   ├── ARC_LoopViT_v1.py       # LoopViT model architecture (v1)
│   ├── ARC_loader.py           # ARC dataset loader & augmentations
│   ├── ARC_ViT.py              # Base ViT components
│   └── attn_hook.py            # Attention hooking for visualization
├── utils/                      # Utilities
│   ├── eval_utils.py           # Evaluation logic
│   ├── eval_utils_ttt.py       # TTT evaluation logic
│   └── vis_renderer.py         # Visualization renderer
├── script/                     # Shell scripts for training/eval
├── offline_train_loop_ARC.py   # Main offline training script
└── test_time_train_ARC.py      # Test-time training interface

Acknowledgements

This codebase builds upon the VARC repository. We thank the authors for their open-source contribution, which facilitated our research.



✒️ Citation

If you find our work useful in your research, please consider citing:

@article{shu2026loopvit,
  title={LoopViT: Scaling Visual ARC with Looped Transformers},
  author={Shu, Wen-Jie and Qiu, Xuerui and Zhu, Rui-Jie and Chen, Harold Haodong and Liu, Yexin and Yang, Harry},
  journal={arXiv preprint},
  year={2026}
}
