Commit c19bede

Merge pull request #4 from o19s/add_book_related_notebooks
Still need some work, but at least got them in
2 parents 8d4c592 + c6a0824 commit c19bede

4 files changed: +752 -1 lines changed

README.md

+1-1
@@ -27,7 +27,7 @@ Browse to http://localhost:8000 and you should see the Jupyterlite interface.

 ## Development 2

-1. Run the docker task, and make the jupyter-lite-build.tgz.
+1. Run `docker run -it --rm -e TARGET_DIR=/dist -v "$(pwd)":/dist $(docker build -q .)` producing the jupyter-lite-build.tgz.
 1. Unzip it into the ./notebooks
 1. `rm -rf public/notebooks` in Quepid
 1. Make sure Quepid's docker-compose.override.yml has a line similar to `- /Users/epugh/Documents/projects/quepid-jupyterlite/notebooks:/srv/app/public/notebooks`
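
The build artifact is a gzipped tarball, so the "unzip" step above is really a tar extraction. A minimal Python sketch of that one step follows. It assumes the archive sits in the current directory and that `./notebooks` is the destination; the helper name `unpack_build` is hypothetical and not part of this repository.

```python
import tarfile
from pathlib import Path


def unpack_build(archive="jupyter-lite-build.tgz", dest="notebooks"):
    """Extract a gzipped tarball into dest and return the extracted file paths."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python releases offer extraction filters that reject unsafe members
            tar.extractall(dest_path, filter="data")
        else:
            tar.extractall(dest_path)
    # List what landed in dest, relative to dest, for a quick sanity check
    return sorted(
        p.relative_to(dest_path).as_posix() for p in dest_path.rglob("*") if p.is_file()
    )
```

Calling `unpack_build()` from the repository root would cover step 2 of the list; the remaining steps (clearing `public/notebooks` and the docker-compose volume line) still happen on the Quepid side.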

jupyterlite/files/README.md

+3
@@ -7,6 +7,9 @@ The example notebooks are stored under ./examples. Feel free to run them, but p
 * `./examples/Scoring Comparison.ipynb` is an example of measuring relevance score change.
 * `./examples/Jaccard and RBO Comparison.ipynb` is an example of comparing query result sets to each other.

+* `./examples/Multiple Raters Analysis.ipynb` looks at how judges compare in their ratings.
+* `./examples/Fleiss Kappa.ipynb` calculates Fleiss' kappa, a measure of agreement among raters.
+
 These notebooks use data from the Haystack Rating Party.

@@ -0,0 +1,160 @@
+{
+  "metadata": {
+    "kernelspec": {
+      "name": "python",
+      "display_name": "Python (Pyodide)",
+      "language": "python"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "python",
+        "version": 3
+      },
+      "file_extension": ".py",
+      "mimetype": "text/x-python",
+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.8"
+    }
+  },
+  "nbformat_minor": 5,
+  "nbformat": 4,
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "source": "# Fleiss' Kappa\nFleiss' kappa measures how much your raters agree with one another beyond the agreement expected by chance, for a fixed number of raters per item.\n\nPlease copy this example and customize it for your own purposes!",
+      "metadata": {},
+      "id": "bd7e4efa-eb00-451e-984d-ed6646d8e25f"
+    },
+    {
+      "cell_type": "markdown",
+      "source": "## Imports",
+      "metadata": {},
+      "id": "e3412382"
+    },
+    {
+      "cell_type": "code",
+      "source": "import pandas as pd\nfrom js import fetch\nimport json\n\nfrom collections import defaultdict\nfrom statsmodels.stats.inter_rater import aggregate_raters\nfrom statsmodels.stats.inter_rater import fleiss_kappa\nfrom IPython.display import display, Markdown",
+      "metadata": {
+        "trusted": true
+      },
+      "execution_count": 1,
+      "outputs": [],
+      "id": "4972936a"
+    },
+    {
+      "cell_type": "markdown",
+      "source": "## Step 0: Configuration",
+      "metadata": {},
+      "id": "6da26c5e"
+    },
+    {
+      "cell_type": "code",
+      "source": "QUEPID_BOOK_NUM = 25\n\n# Not needed if running within Quepid JupyterLite\n# QUEPID_API_TOKEN = \"\"",
+      "metadata": {
+        "trusted": true
+      },
+      "execution_count": 3,
+      "outputs": [],
+      "id": "71803a49-4065-4adf-a69e-cb0fe2d00f22"
+    },
+    {
+      "cell_type": "markdown",
+      "source": "## Step 1: Download the Quepid Book",
+      "metadata": {},
+      "id": "420416df-9e6a-41b4-987b-7a03c9dd38b3"
+    },
+    {
+      "cell_type": "code",
+      "source": "# Generic GET call to a JSON endpoint\nasync def get_json(url):\n    resp = await fetch(url)\n    resp_text = await resp.text()\n    return json.loads(resp_text)\n\n",
+      "metadata": {
+        "trusted": true
+      },
+      "execution_count": 4,
+      "outputs": [],
+      "id": "31193536-98eb-4b46-ab98-af04ee07c6d3"
+    },
+    {
+      "cell_type": "code",
+      "source": "data = await get_json(f'/api/export/books/{QUEPID_BOOK_NUM}')",
+      "metadata": {
+        "trusted": true
+      },
+      "execution_count": 5,
+      "outputs": [],
+      "id": "8fef6231-daa8-467f-ac57-13a144e8a356"
+    },
+    {
+      "cell_type": "markdown",
+      "source": "## Step 2: Extract and Prepare Data",
+      "metadata": {},
+      "id": "79d985ad-cd11-44a9-a7e1-0851bc99aef3"
+    },
+    {
+      "cell_type": "code",
+      "source": "# Initialize a list to hold the tuples of (doc_id, rating, count)\nratings_data = []\n\n# Iterate through each query-doc pair\nfor pair in data['query_doc_pairs']:\n    # Initialize a dictionary to count the ratings for this pair\n    ratings_count = defaultdict(int)\n\n    # Extract judgements and count the ratings\n    for judgement in pair['judgements']:\n        rating = judgement['rating']\n        ratings_count[rating] += 1\n\n    # Append the counts to the ratings_data list\n    for rating, count in ratings_count.items():\n        ratings_data.append((pair['doc_id'], rating, count))\n",
+      "metadata": {
+        "trusted": true
+      },
+      "execution_count": 6,
+      "outputs": [],
+      "id": "9a8561fd-2dbf-477e-9ac1-4df6d5ebdc91"
+    },
104+
{
105+
"cell_type": "markdown",
106+
"source": "## Step 3: Aggregate Raters' Data",
107+
"metadata": {},
108+
"id": "caf5632b-132a-4e1b-80fe-c8c5ab7f2f3a"
109+
},
110+
{
111+
"cell_type": "code",
112+
"source": "# Convert ratings_data to a DataFrame\ndf = pd.DataFrame(ratings_data, columns=['doc_id', 'rating', 'count'])\n\n# Use crosstab to create a contingency table\ndata_crosstab = pd.crosstab(index=df['doc_id'], columns=df['rating'], values=df['count'], aggfunc='sum')\n\n# Drop any rows missing judgements\ndata_crosstab = data_crosstab.dropna(how='any')\n\n# Convert the DataFrame to the format expected by aggregate_raters\ndata_for_aggregation = data_crosstab.values\n\n# Aggregate the raters' data\ntable, _ = aggregate_raters(data_for_aggregation)",
113+
"metadata": {
114+
"trusted": true
115+
},
116+
"execution_count": 7,
117+
"outputs": [],
118+
"id": "a7598308-129b-4628-ad3a-fc3d703f8205"
119+
},
120+
{
121+
"cell_type": "markdown",
122+
"source": "## Step 4: Compute Fleiss' Kappa",
123+
"metadata": {},
124+
"id": "25c79fbc"
125+
},
126+
{
127+
"cell_type": "code",
128+
"source": "kappa = fleiss_kappa(table, method='fleiss')\ndisplay(Markdown(f\"## Fleiss' Kappa: {kappa:.4f}\"))",
129+
"metadata": {
130+
"trusted": true
131+
},
132+
"execution_count": 8,
133+
"outputs": [
134+
{
135+
"output_type": "display_data",
136+
"data": {
137+
"text/plain": "<IPython.core.display.Markdown object>",
138+
"text/markdown": "## Fleiss' Kappa: -0.3333"
139+
},
140+
"metadata": {}
141+
}
142+
],
143+
"id": "25a613f9"
144+
},
+    {
+      "cell_type": "markdown",
+      "source": "_This notebook was last updated 17-FEB-2024_",
+      "metadata": {},
+      "id": "5704579e-2321-4629-8de0-6608b428e2b6"
+    },
+    {
+      "cell_type": "code",
+      "source": "",
+      "metadata": {},
+      "execution_count": null,
+      "outputs": [],
+      "id": "7203f6cc-c068-4f75-a59a-1f49c5555319"
+    }
+  ]
+}
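
The counting and scoring steps in the notebook can also be sketched without pandas or statsmodels, which makes the arithmetic behind Fleiss' kappa easier to follow. This is a minimal illustration and not the notebook's code. The helper names `build_count_table` and `fleiss_kappa_from_counts` are hypothetical, the sample `pairs` data is made up, and the sketch assumes the `query_doc_pairs` / `judgements` / `rating` layout read in Step 2. It also assumes that judgements without a rating should be skipped and that only pairs with exactly the requested number of ratings are scored.

```python
from collections import Counter


def build_count_table(query_doc_pairs, raters_per_pair):
    """Return (categories, rows) with one row of per-category counts per pair."""
    kept = []
    for pair in query_doc_pairs:
        # Assumption: a judgement without a rating carries no vote, so skip it
        ratings = [j["rating"] for j in pair["judgements"] if j.get("rating") is not None]
        # Fleiss' kappa needs the same number of ratings for every subject
        if len(ratings) == raters_per_pair:
            kept.append(ratings)
    categories = sorted({r for ratings in kept for r in ratings})
    rows = []
    for ratings in kept:
        counts = Counter(ratings)
        rows.append([counts[c] for c in categories])
    return categories, rows


def fleiss_kappa_from_counts(rows):
    """Compute Fleiss' kappa from a subjects x categories table of counts."""
    if not rows:
        raise ValueError("need at least one rated subject")
    n_subjects = len(rows)
    n_raters = sum(rows[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in rows):
        raise ValueError("every subject needs the same number (>= 2) of ratings")
    n_categories = len(rows[0])
    # Share of all ratings that fell into each category
    p_j = [
        sum(row[j] for row in rows) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Agreement expected by chance
    p_e = sum(p * p for p in p_j)
    # Observed agreement: per-subject agreement averaged over all subjects
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in rows
    ) / n_subjects
    if p_e == 1.0:
        return float("nan")  # only one category was ever used; kappa is undefined
    return (p_bar - p_e) / (1.0 - p_e)


# Made-up sample data in the shape the notebook reads from the export
pairs = [
    {"doc_id": "a", "judgements": [{"rating": 1}, {"rating": 1}, {"rating": 1}]},
    {"doc_id": "b", "judgements": [{"rating": 0}, {"rating": 0}, {"rating": 0}]},
    {"doc_id": "c", "judgements": [{"rating": 1}]},  # dropped: only one rating
]
categories, rows = build_count_table(pairs, raters_per_pair=3)
kappa = fleiss_kappa_from_counts(rows)
print(categories, rows, kappa)
```

In the sample, all three raters agree on both kept documents, so the observed agreement is 1 and kappa comes out as 1.0. A kappa near 0 means agreement no better than chance, and negative values mean raters agree less often than chance would predict.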
