Hi! The script MME_score.py does not work properly because of a bug in the code. For example, in the following check,

    if gt_question.lower() in pred_caption:
        pred_ans = True
    else:
        pred_ans = False

pred_caption is a single word (i.e., Yes or No), whereas gt_question.lower() is a full sentence, so the substring test almost never succeeds. As a result,
pred_ans is almost always False. BTW, could you add comments to the key parts of the code? Thanks a lot! :)
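
To make the problem concrete, here is a minimal sketch that reproduces the check and shows one possible fix. It assumes the ground-truth answer is available as a Yes/No string; the names gt_answer, normalize_yes_no and answer_matches are hypothetical and do not come from MME_score.py, and the sample question is made up for illustration.

```python
def buggy_match(gt_question: str, pred_caption: str) -> bool:
    # Mirrors the quoted check: a whole sentence is searched for
    # inside a one-word answer, so it practically never matches.
    return gt_question.lower() in pred_caption


def normalize_yes_no(text: str):
    """Hypothetical helper: map model output to 'yes', 'no', or None."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    if words and words[0] in ("yes", "no"):
        return words[0]
    return None


def answer_matches(gt_answer: str, pred_caption: str) -> bool:
    # Assumed fix: compare the prediction with the ground-truth
    # answer (Yes/No) instead of with the question text.
    pred = normalize_yes_no(pred_caption)
    return pred is not None and pred == normalize_yes_no(gt_answer)


# Made-up sample question, for illustration only.
question = "Is there a dog in this image? Please answer yes or no."
print(buggy_match(question, "yes"))   # False
print(buggy_match(question, "no"))    # False
print(answer_matches("Yes", "Yes."))  # True
print(answer_matches("No", "Yes"))    # False
```

With the quoted check both predictions come out as False regardless of the answer, whereas comparing against the answer distinguishes them. I may be misreading what gt_question holds, so please correct me if the variable is meant to be something else.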
Code reference: HALC/eval/MME_score.py, lines 124 to 127 in fc32840