Making LLMAttribution work with BertForMultipleChoice models #1524
Comments
Continuing to pursue the first approach: the exception has had me looking into all the technology around text generation. But I don't see why text generation should even be involved in a multiple-choice task. I also ran across recent work on multiple choice as part of logits-processor-zoo, but this logits processing doesn't seem to be what I need for the second approach.
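For context on why generation enters at all (my reading of Captum's docs, not code from this thread): LLMAttribution is built around decoder-style models, where the target is a span of generated text whose log-probability gets attributed back to the prompt. A minimal sketch, with GPT-2 standing in for Llama; the prompt and target strings are illustrative:

```python
from captum.attr import FeatureAblation, LLMAttribution, TextTokenInput
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# LLMAttribution wraps a perturbation method and scores the log-probability
# of the target tokens under the model's generation path -- hence the
# dependency on HuggingFace's generation API.
fa = FeatureAblation(model)
llm_attr = LLMAttribution(fa, tokenizer)

inp = TextTokenInput("The sky is", tokenizer)
res = llm_attr.attribute(inp, target="blue")
print(res.seq_attr)  # one attribution score per input token
```

BertForMultipleChoice never produces text, so this generation-shaped pipeline has nothing to latch onto.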
1st approach
I think it shows that for HuggingFace, the implementation of the BERT you are using is different from other, more common LLMs like Llama. Their APIs are not compatible. For this specific case, you can further try setting this flag:
2nd approach
Glad to see you made the 2nd approach work. As you have found, the original error has nothing to do with Captum. It means one arg
For the shape
@aobo-y thank you once again!
Re: 1st approach, I agree that BertForMultipleChoice is too different a model from what Captum expects, related to
Re: 2nd approach, first, you're right about the input's shape (batch_size, seq_length, embedding_size). But can you unpack what "impact" means? As an ablation technique, can I assume it has to do with the change in log_probs[target] WITHOUT that input? I'm looking for the formula/documentation for these specifics.
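For what it's worth, my understanding of the "impact" in feature ablation (a hedged sketch of the general technique, not Captum's exact internals; `ablation_impact` and the scalar-per-feature grouping are mine, whereas Captum typically groups a whole token's embedding row via a feature mask): it is precisely the change in the target score, e.g. log_probs[target], when one feature is replaced by a baseline while everything else is held fixed:

```python
import torch

def ablation_impact(forward_fn, inputs, target, baseline=0.0):
    # inputs: (1, num_features); forward_fn: (1, num_features) -> (1, num_classes)
    with torch.no_grad():
        full = forward_fn(inputs)[0, target]          # score with all features intact
        impact = torch.zeros(inputs.shape[1])
        for i in range(inputs.shape[1]):
            ablated = inputs.clone()
            ablated[:, i] = baseline                  # replace only feature i
            impact[i] = full - forward_fn(ablated)[0, target]
    return impact
```

So impact_i = f(x) - f(x with feature i set to the baseline); a positive value means the feature was helping the target score.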
🚀 Feature
Allow LLMAttribution goodness to be applied to BERT models for multiple choice tasks
Motivation
Following up on suggestions from aobo-y.
Pitch
Integrated gradient attribution techniques work over BertForMultipleChoice; it would be great if
FeatureAblation / LLMAttribution did, too.
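For reference, the integrated-gradients setup that already works looks roughly like this (a minimal sketch; the prompt, choices, and the choice_logits wrapper are illustrative, not taken from the issue):

```python
import torch
from captum.attr import LayerIntegratedGradients
from transformers import AutoTokenizer, BertForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForMultipleChoice.from_pretrained("bert-base-uncased").eval()

prompt = "The sky is"
choices = ["blue.", "made of cheese."]
enc = tokenizer([prompt] * len(choices), choices, return_tensors="pt", padding=True)
input_ids = enc["input_ids"].unsqueeze(0)            # (1, num_choices, seq_len)
attention_mask = enc["attention_mask"].unsqueeze(0)

def choice_logits(input_ids, attention_mask):
    # BertForMultipleChoice scores every candidate: (batch, num_choices)
    return model(input_ids=input_ids, attention_mask=attention_mask).logits

# Attribute through the embedding layer so integer token ids work with IG.
lig = LayerIntegratedGradients(choice_logits, model.bert.embeddings)
attributions = lig.attribute(
    inputs=input_ids,
    additional_forward_args=(attention_mask,),
    target=0,  # attribute the logit of choice 0
)
```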
Alternatives
Two suggestions were made:
First approach:
throws error:
Dropping the additional_forward_args parameter gets farther, but throws in self.model.prepare_inputs_for_generation:
model_inp: tensor, torch.Size([1, 112])
model_kwargs.keys()
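My reading of this trace (an assumption, not confirmed in the thread): the attribution pipeline ends up in prepare_inputs_for_generation because it drives HuggingFace's generation API, which BertForMultipleChoice does not implement. Recent transformers versions expose can_generate() to check this up front:

```python
from transformers import AutoModelForCausalLM, BertForMultipleChoice

# A causal LM implements the generation API that LLMAttribution drives...
gpt2 = AutoModelForCausalLM.from_pretrained("gpt2")
print(gpt2.can_generate())       # True

# ...while the multiple-choice head is a classifier and does not.
bert_mc = BertForMultipleChoice.from_pretrained("bert-base-uncased")
print(bert_mc.can_generate())    # False
```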
Second approach:
throws
This is too far into Transformer API-land for me to follow.
Additional context
Additional details in original issue #1523