
TransferCVLM

TransferCVLM is a method for efficiently transferring knowledge from an existing vision-language pre-trained model to a combination of two unimodal pre-trained models, one for vision and one for language.

Workflow

[Workflow diagram]

Requirements

Python==3.8
PyTorch==1.10.0
transformers==4.35.0
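One possible way to set up a matching environment is sketched below. The environment name is arbitrary, and the sketch assumes PyTorch is installed from the standard `torch` pip distribution; adjust for your CUDA version as needed.

```shell
# Create and activate an isolated environment (name is arbitrary)
conda create -n transfercvlm python=3.8
conda activate transfercvlm

# Install the pinned dependencies listed above
pip install torch==1.10.0 transformers==4.35.0
```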

Execution guide

i) Run "run_flava_{TASK}.py" or "run_git_{TASK}.py" to obtain the teacher model.

ii) Run "run_cvlm_{TASK}.py" or "run_cvlm_captioning_{model}.py" to obtain the fine-tuned CVLM model. (Phase 1)

iii) Run "transfer_flava2cvlm_{task}.py" or "transfer_git2cvlm_captioning_{model}.py" to obtain the final model. (Phase 2) Requires the results of steps i) and ii).

iv) Run "transfer_cvlm2cvlm_{task}.py" to obtain the Phase 2^MC model described in Sections 2.3 and 3.4 of the paper. Requires the result of step iii) and a new step i) result.
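The four steps above can be sketched as a single pipeline. The task and model names below (`vqa`, `flava`) are hypothetical placeholders for the `{TASK}`/`{task}` and `{model}` slots in the script names; substitute the values actually provided in this repository, and pass whatever arguments each script expects.

```shell
# Hypothetical placeholder values -- replace with the real ones
TASK=vqa
MODEL=flava

# i) Obtain the teacher model
python run_${MODEL}_${TASK}.py

# ii) Phase 1: fine-tune the CVLM model
python run_cvlm_${TASK}.py

# iii) Phase 2: transfer teacher knowledge to the CVLM
#      (requires the outputs of steps i and ii)
python transfer_${MODEL}2cvlm_${TASK}.py

# iv) Phase 2^MC: CVLM-to-CVLM transfer
#     (requires the output of step iii and a fresh step i result)
python transfer_cvlm2cvlm_${TASK}.py
```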

Citation

@inproceedings{choi2024transfercvlm,
  title={TransferCVLM: Transferring Cross-Modal Knowledge for Vision-Language Modeling},
  author={Choi, Dongha and Kim, Jung Jae and Lee, Hyunju},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
  pages={16733--16746},
  year={2024}
}
