- Accepted at the 2025 CVPR Workshop VizWiz Grand Challenge
- Under review at Knowledge-Based Systems, 2025 (Q1, IF: 7.6)
Best model: our best-performing model can be downloaded here:
https://drive.google.com/drive/folders/1-aADgu93SDutxhjZQpT5ARD_a6msxSNS?usp=drive_link
Key ideas
- HQD (Hierarchical Question Decomposition): decomposes complex questions into sub-queries (e.g., subject → attribute → relation) to mitigate language bias.
- EM (Ensemble + Margin): ensembles models with different backbones and training seeds, and expands the decision boundary using an Adaptive Angular Margin.
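
To make the EM idea concrete, here is a minimal sketch in plain Python. It is not the released implementation. It assumes an ArcFace-style additive angular margin applied to the target answer, a hypothetical `adaptive_margin` rule that gives rarer answers a larger margin, and an ensemble that averages softmax probabilities across models. All function names, the `scale` value, and the margin rule are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_margin(class_freq, base=0.5):
    """Hypothetical rule: margin shrinks linearly as the answer gets more frequent.

    class_freq is the answer's relative training frequency in [0, 1].
    """
    return base * (1.0 - class_freq)


def angular_margin_logits(cosines, target, margin, scale=16.0):
    """ArcFace-style margin (assumed form): cos(theta) -> cos(theta + margin)
    for the target class only, then multiply every cosine by `scale`."""
    out = []
    for i, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if i == target:
            theta = math.acos(c)
            # Clamp so the shifted angle stays within [0, pi].
            c = math.cos(min(math.pi, theta + margin))
        out.append(scale * c)
    return out


def ensemble_predict(per_model_logits):
    """Average softmax probabilities over models; return (argmax, averaged probs)."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    avg = [sum(p[i] for p in probs) / n_models for i in range(n_classes)]
    best = max(range(n_classes), key=avg.__getitem__)
    return best, avg
```

During training, the margin lowers the target logit, so the model has to separate the correct answer by a wider angle to reach the same loss. At inference, no margin is applied and only `ensemble_predict` is used on the per-model logits.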
Table: results on VQA-CP v2
Our model outperforms the other methods overall and achieves particularly strong results on the challenging Num and Other categories.

