Add mbpp #801

Open · wants to merge 42 commits into eval-hackathon

Commits (42)
8206ffc  add EN prompts for xcopa (Jul 15, 2022)
6af75cb  update ids (2/5) (Jul 15, 2022)
1d4d0cc  regenerate ids (4/6) (Jul 15, 2022)
e020740  regenerate ids (5/6) (Jul 15, 2022)
8511070  regenerate ids (6/6) (Jul 15, 2022)
cc65cf3  fix typo (Jul 15, 2022)
670e6ad  fix last KeyError? (Jul 15, 2022)
f906000  Merge pull request #1 from haileyschoelkopf/xcopa (Muennighoff, Jul 16, 2022)
c56ce86  Add xwinograd/en (Muennighoff, Jul 17, 2022)
d164dbf  Duplicate template (Muennighoff, Jul 17, 2022)
0bfade4  Format (Muennighoff, Jul 18, 2022)
33ab12e  Merge pull request #2 from bigscience-workshop/eval-hackathon (Muennighoff, Jul 18, 2022)
c80db69  wip - test (VictorSanh, Jul 18, 2022)
12ef8de  wip - test (VictorSanh, Jul 18, 2022)
35e29a8  wip - most stupid (VictorSanh, Jul 18, 2022)
4d5c500  wip - test (VictorSanh, Jul 18, 2022)
62ee1a8  de (VictorSanh, Jul 18, 2022)
1d55c8d  re-clean it - have not made it work yet (VictorSanh, Jul 18, 2022)
fcc9b8c  Change IDs (Muennighoff, Jul 18, 2022)
0f93797  Add mbpp (Muennighoff, Jul 18, 2022)
a27a911  Fix ids (Muennighoff, Jul 18, 2022)
624878f  Merge branch 'tr13' into muennighoff/xwinogrande (Muennighoff, Jul 19, 2022)
9a38a4b  Merge pull request #3 from Muennighoff/muennighoff/xwinogrande (Muennighoff, Jul 19, 2022)
619daa1  Merge pull request #4 from bigscience-workshop/eval-hackathon (Muennighoff, Jul 19, 2022)
855549f  Merge pull request #5 from Muennighoff/eval-hackathon (Muennighoff, Jul 19, 2022)
3053aa0  Rmv incompat prompts (Muennighoff, Jul 19, 2022)
36477c1  Add eng template xcopa (Muennighoff, Jul 19, 2022)
78dd720  Assimilate en (Muennighoff, Jul 19, 2022)
ba36b0c  例えば ("for example") (Muennighoff, Jul 19, 2022)
d05a9ee  Remove var (Muennighoff, Jul 19, 2022)
8125d23  Remove dup (Muennighoff, Jul 19, 2022)
0815263  Merge branch 'tr13' into tatoeba (Muennighoff, Jul 19, 2022)
baf0b58  Merge pull request #6 from Muennighoff/tatoeba (Muennighoff, Jul 19, 2022)
be399eb  Swap source & target; Rmv script (Muennighoff, Jul 21, 2022)
45a3321  Merge branch 'tatoeba' of https://github.com/Muennighoff/promptsou… (Muennighoff, Jul 21, 2022)
96db184  Merge pull request #7 from Muennighoff/tatoeba (Muennighoff, Jul 21, 2022)
108dc9a  Move mbpp to new ds (Muennighoff, Jul 21, 2022)
9265d00  Add ds name (Muennighoff, Jul 21, 2022)
09a3e1d  move to subfolder (Muennighoff, Jul 21, 2022)
8aeb485  Merge branch 'tr13' into mbpp (Muennighoff, Jul 21, 2022)
628ba99  text -> prompt (Muennighoff, Jul 22, 2022)
3e9dda6  Update templates.yaml (Muennighoff, Jul 22, 2022)
promptsource/templates.py (14 additions, 1 deletion)

@@ -27,7 +27,20 @@
 # These are users whose datasets should be included in the results returned by
 # filter_english_datasets (regardless of their metadata)
 
-INCLUDED_USERS = {"Zaid", "craffel", "GEM", "aps", "khalidalt", "shanya", "rbawden", "BigScienceBiasEval", "gsarti"}
+INCLUDED_USERS = {
+    "Zaid",
+    "craffel",
+    "GEM",
+    "aps",
+    "khalidalt",
+    "shanya",
+    "rbawden",
+    "BigScienceBiasEval",
+    "gsarti",
+    "Helsinki-NLP",
+    "Muennighoff",
+    "facebook",
+}
 
 # These are the metrics with which templates can be tagged
 METRICS = {
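Per the comment in the diff, INCLUDED_USERS is an allow-list consulted by filter_english_datasets: datasets owned by these users are kept regardless of their metadata. The PR adds Helsinki-NLP, Muennighoff, and facebook so that the datasets it prompts (such as Muennighoff/mbpp) surface in the app. A minimal sketch of how such an allow-list check typically works; the helper name and signature here are illustrative, not promptsource's actual API:

# Illustrative sketch only: the real filter_english_datasets may differ;
# this just shows the role the INCLUDED_USERS allow-list plays.
INCLUDED_USERS = {
    "Zaid", "craffel", "GEM", "aps", "khalidalt", "shanya", "rbawden",
    "BigScienceBiasEval", "gsarti", "Helsinki-NLP", "Muennighoff", "facebook",
}

def is_included(dataset_id: str, is_english: bool) -> bool:
    """Keep a dataset if its owner is allow-listed, regardless of metadata."""
    # Namespaced Hub datasets look like "Muennighoff/mbpp"; the part before
    # the slash is the owning user or organization.
    owner = dataset_id.split("/")[0] if "/" in dataset_id else ""
    return owner in INCLUDED_USERS or is_english

assert is_included("Muennighoff/mbpp", is_english=False)  # allow-listed by this PR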
promptsource/templates/Muennighoff/mbpp/sanitized/templates.yaml (new file, 32 additions)

@@ -0,0 +1,32 @@
+dataset: Muennighoff/mbpp
+subset: sanitized
+templates:
+  4b108b1c-7514-488f-99ed-3ca5da70e103: !Template
+    answer_choices: null
+    id: 4b108b1c-7514-488f-99ed-3ca5da70e103
+    jinja: '{{ prompt }}
+      Here is a solution in Python:
+      |||
+      {{ code }}'
+    metadata: !TemplateMetadata
+      choices_in_prompt: false
+      languages:
+      - en
+      metrics:
+      - Other
+      original_task: true
+    name: function solution
+    reference: ''
+  9d85c898-70fe-4a51-be37-5111be357762: !Template
+    answer_choices: null
+    id: 9d85c898-70fe-4a51-be37-5111be357762
+    jinja: "{{ prompt }} This can be solved in Python with the following code: |||{{ code }}"
+    metadata: !TemplateMetadata
+      choices_in_prompt: false
+      languages:
+      - en
+      metrics:
+      - Other
+      original_task: false
+    name: function solved
+    reference: ''
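Both templates follow the promptsource convention of separating the model input from the target with |||, and reference the prompt and code fields of Muennighoff/mbpp (the "text -> prompt" commit suggests prompt was renamed from the original mbpp text column). A sketch of how the second template renders, using plain jinja2 rather than promptsource's own machinery; the example record is made up:

# Renders one of the new mbpp templates with jinja2; the record below is a
# made-up mbpp-style example, not actual dataset content.
from jinja2 import Template

jinja_src = "{{ prompt }} This can be solved in Python with the following code: |||{{ code }}"

record = {
    "prompt": "Write a function that adds two numbers.",
    "code": "def add(a, b):\n    return a + b",
}

rendered = Template(jinja_src).render(**record)
# promptsource uses "|||" to split the model input from the target text.
model_input, target = rendered.split("|||")
print(model_input.strip())
print(target.strip())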
promptsource/templates/super_glue/copa/templates.yaml (60 deletions)

@@ -22,21 +22,6 @@ templates:
       original_task: true
     name: exercise
     reference: ''
-  150789fe-e309-47a1-82c9-0a4dc2c6b12b: !Template
-    answer_choices: '{{choice1}} ||| {{choice2}}'
-    id: 150789fe-e309-47a1-82c9-0a4dc2c6b12b
-    jinja: "{% if question == \"effect\" %} \n{{ premise }} What could happen next,\
-      \ \"{{ answer_choices[0] }}\" or \"{{ answer_choices[1] }}\"? ||| {% if label\
-      \ != -1 %}{{ answer_choices[label] }}{%endif%}\n{% endif %}"
-    metadata: !TemplateMetadata
-      choices_in_prompt: true
-      languages:
-      - en
-      metrics:
-      - Accuracy
-      original_task: true
-    name: "\u2026What could happen next, C1 or C2?"
-    reference: ''
   4d879cbe-2fd7-424a-9d78-3f5200313fba: !Template
     answer_choices: '{{choice1}} ||| {{choice2}}'
     id: 4d879cbe-2fd7-424a-9d78-3f5200313fba
@@ -88,21 +73,6 @@
       original_task: true
     name: "C1 or C2? premise, so/because\u2026"
     reference: "Adapted from Perez et al. 2021 and Schick & Sch\xFCtz 2021."
-  84da62c2-9440-4cfc-bdd4-d70c65e33a82: !Template
-    answer_choices: '{{choice1}} ||| {{choice2}}'
-    id: 84da62c2-9440-4cfc-bdd4-d70c65e33a82
-    jinja: "{% if question == \"effect\" %} \n{{ premise }} As a result, \"{{ answer_choices[0]\
-      \ }}\" or \"{{ answer_choices[1] }}\"? ||| {% if label != -1 %}{{ answer_choices[label]\
-      \ }}{%endif%}\n{% endif %}"
-    metadata: !TemplateMetadata
-      choices_in_prompt: true
-      languages:
-      - en
-      metrics:
-      - Accuracy
-      original_task: true
-    name: "\u2026As a result, C1 or C2?"
-    reference: ''
   8ce80f8a-239e-4393-892c-f63dbb0d9929: !Template
     answer_choices: '{{choice1}} ||| {{choice2}}'
     id: 8ce80f8a-239e-4393-892c-f63dbb0d9929
@@ -118,21 +88,6 @@
       original_task: true
     name: best_option
     reference: ''
-  8cf2ba73-aee5-4651-b5d4-b1b88afe4abb: !Template
-    answer_choices: '{{choice1}} ||| {{choice2}}'
-    id: 8cf2ba73-aee5-4651-b5d4-b1b88afe4abb
-    jinja: "{% if question == \"cause\" %} \n{{ premise }} Which may be caused by\
-      \ \"{{ answer_choices[0] }}\" or \"{{ answer_choices[1] }}\"? ||| {% if label\
-      \ != -1 %}{{ answer_choices[label] }}{%endif%}\n{% endif %}"
-    metadata: !TemplateMetadata
-      choices_in_prompt: true
-      languages:
-      - en
-      metrics:
-      - Accuracy
-      original_task: true
-    name: "\u2026which may be caused by"
-    reference: ''
   a1f9951e-2b6b-4530-9636-9cdf4c1658c5: !Template
     answer_choices: '{{choice1}} ||| {{choice2}}'
     id: a1f9951e-2b6b-4530-9636-9cdf4c1658c5
@@ -174,21 +129,6 @@
       original_task: true
     name: cause_effect
     reference: ''
-  a8bf11c3-bea2-45ba-a533-957d8bee5e2e: !Template
-    answer_choices: '{{choice1}} ||| {{choice2}}'
-    id: a8bf11c3-bea2-45ba-a533-957d8bee5e2e
-    jinja: "{% if question == \"cause\" %} \n{{ premise }} Why? \"{{ answer_choices[0]\
-      \ }}\" or \"{{ answer_choices[1] }}\"? ||| {% if label != -1 %}{{ answer_choices[label]\
-      \ }}{%endif%}\n{% endif %}"
-    metadata: !TemplateMetadata
-      choices_in_prompt: true
-      languages:
-      - en
-      metrics:
-      - Accuracy
-      original_task: true
-    name: "\u2026why? C1 or C2"
-    reference: ''
   f32348cd-d3cb-4619-87b9-e24f99c78567: !Template
     answer_choices: '{{choice1}} ||| {{choice2}}'
     id: f32348cd-d3cb-4619-87b9-e24f99c78567
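Each of the four deleted templates wraps its entire body in {% if question == "cause" %} or {% if question == "effect" %}, so it renders an empty string for COPA examples of the other question type; the commit message "Rmv incompat prompts" suggests that incompatibility is why they were dropped, though the PR itself does not state a reason. A sketch of that gating behavior with plain jinja2 and a made-up example:

# Demonstrates the conditional gating in the removed COPA templates: the
# body only renders when the example's `question` field matches the guard.
# Plain jinja2 with made-up values; not promptsource's own rendering code.
from jinja2 import Template

src = (
    '{% if question == "effect" %} \n'
    '{{ premise }} What could happen next, "{{ answer_choices[0] }}" or '
    '"{{ answer_choices[1] }}"? ||| '
    '{% if label != -1 %}{{ answer_choices[label] }}{% endif %}\n'
    '{% endif %}'
)
template = Template(src)

effect_example = {
    "question": "effect",
    "premise": "It started to rain.",
    "answer_choices": ["people opened their umbrellas", "the streets got dusty"],
    "label": 0,
}
cause_example = dict(effect_example, question="cause")

print(repr(template.render(**effect_example)))  # premise + question ||| answer
print(repr(template.render(**cause_example)))   # '' because the outer guard fails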