A collection of referring image segmentation papers and datasets.
Feel free to open a PR or an issue.
Short name | Paper | Source | Code/Project Link |
---|---|---|---|
MeViS | MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | ICCV 2023 | [dataset] [project] |
gRefCOCO | GRES: Generalized Referring Expression Segmentation | CVPR 2023 | [dataset] [project] |
ClevrTex | ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation | NeurIPS Datasets and Benchmarks 2021 | [project] |
ScanRefer | ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language | ECCV 2020 | [project] |
VGPhraseCut | PhraseCut: Language-based Image Segmentation in the Wild | CVPR 2020 | [project] |
CLEVR-Ref+ | CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions | CVPR 2019 | [project] |
UNC | Modeling Context in Referring Expressions | ECCV 2016 | [dataset] |
UNC+ | Modeling Context in Referring Expressions | ECCV 2016 | [dataset] |
Google-Ref | Generation and Comprehension of Unambiguous Object Descriptions | CVPR 2016 | [dataset] |
ReferIt | ReferItGame: Referring to Objects in Photographs of Natural Scenes | EMNLP 2014 | [project] |

Name | Workshop | Date | Submission Link |
---|---|---|---|
1st MeViS Challenge | CVPR 2024 Workshop: Pixel-level Video Understanding in the Wild | May 2024 | [CodaLab] |
RVOS Challenge | ECCV 2024 Workshop: The 6th Large-scale Video Object Segmentation Challenge | Aug 2024 | [CodaLab] |

Short name | Paper | Source | Code/Project Link |
---|---|---|---|
PhraseClick | PhraseClick: Toward Achieving Flexible Interactive Segmentation by Phrase and Click | ECCV 2020 | |

Short name | Paper | Source | Code/Project Link |
---|---|---|---|
X-RefSeg3D | X-RefSeg3D: Enhancing Referring 3D Instance Segmentation via Structured Cross-Modal Graph Neural Networks | AAAI 2024 | [code] |
3D-STMN | 3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation | AAAI 2024 | [code] |
SegPoint | SegPoint: Segment Any Point Cloud via Large Language Model | ECCV 2024 | [project] |
3D-GRES | 3D-GRES: Generalized 3D Referring Expression Segmentation | ACM MM 2024 | [code] |
RefMask3D | RefMask3D: Language-Guided Transformer for 3D Referring Segmentation | ACM MM 2024 | [code] |
TGNN | Text-Guided Graph Neural Networks for Referring 3D Instance Segmentation | AAAI 2021 | |
InstanceRefer | InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring | ICCV 2021 | [code] |