Add UnQovering dataset support (QA) #495

dcecchini · 2023-06-05T20:28:19Z

Add this dataset for QA tests (bias).

Reference: https://github.com/allenai/unqover

alytarik · 2023-07-24T09:21:59Z

UnQover is a fairly unique dataset in the sense that it does not have "correct" labels and it uses the answers directly to check bias. Below is an example data sample. Our approach is to change some stuff in the input and then test the model, this samples and dataset is not very suitable for that. We can skip this for now or maybe you have some ideas @dcecchini

dcecchini · 2023-07-24T12:12:11Z

Hi @alytarik, I think what is important on this dataset is more on the process they used to generate those questions. They already identified cases in which they can automatically generate the questions that have high probability of containing bias. The lists of adjectives, templates, etc are present on files:

They also have scripts to create and fill the templates. Checking their visualization demo, we can see nice examples that could be a new feature of LangTest -- not giving a score on specific test, but an analysis tool to help researchers visualize how the model behaves given some inputs that may contain bias. For example:

What do you think, @JulesBelveze ?

JulesBelveze · 2023-07-24T12:53:16Z

I also think it would be a great addition.

I would argue that in the setting of "underspecified context" the model should not answer anything. @alytarik in the example you shared the model should actually produce something like "I don't know" or "I am lacking context" but it shouldn't answer "Alice" nor "Justin", right?

@dcecchini I also really like what's under the "Under-specified Question" section of the demo you shared. We could definitely let the user choose a bias category (say "ethnicity"), use the templates to generate samples, and compute a bias score. Basically, let the user perform exactly what their demo does. What do you think?

@alytarik I can give you a hand on how to design a solution to integrate to langtest

JulesBelveze · 2023-07-26T13:54:36Z

@alytarik how is that going?

alytarik · 2023-07-27T08:55:30Z

@JulesBelveze i was focused on #579 for a while. I will be working on this after i finish up its tests etc.

JulesBelveze added the ⏭️ Next Release Issues or Request for the next release label Jul 17, 2023

ArshaanNazir assigned alytarik Jul 19, 2023

ArshaanNazir added v2.1.0 Issue or request to be done in v2.1.0 release and removed ⏭️ Next Release Issues or Request for the next release labels Jul 19, 2023

ArshaanNazir added ⏭️ Next Release Issues or Request for the next release and removed v2.1.0 Issue or request to be done in v2.1.0 release labels Sep 4, 2023

ArshaanNazir assigned chakravarthik27 and unassigned alytarik Sep 5, 2023

ArshaanNazir added v2.1.0 Issue or request to be done in v2.1.0 release and removed ⏭️ Next Release Issues or Request for the next release labels Sep 6, 2023

chakravarthik27 added ⏭️ Next Release Issues or Request for the next release and removed v2.1.0 Issue or request to be done in v2.1.0 release labels Oct 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add UnQovering dataset support (QA) #495

Add UnQovering dataset support (QA) #495

dcecchini commented Jun 5, 2023 •

edited

Loading

alytarik commented Jul 24, 2023

dcecchini commented Jul 24, 2023

JulesBelveze commented Jul 24, 2023

JulesBelveze commented Jul 26, 2023

alytarik commented Jul 27, 2023

Add UnQovering dataset support (QA) #495

Add UnQovering dataset support (QA) #495

Comments

dcecchini commented Jun 5, 2023 • edited Loading

alytarik commented Jul 24, 2023

dcecchini commented Jul 24, 2023

JulesBelveze commented Jul 24, 2023

JulesBelveze commented Jul 26, 2023

alytarik commented Jul 27, 2023

dcecchini commented Jun 5, 2023 •

edited

Loading