Add NetEaseCrowd dataset #101
Codecov Report
All modified and coverable lines are covered by tests ✅

Coverage Diff
              main     #101      +/-
Coverage    92.80%   92.96%   +0.15%
Files           47       47
Lines         2070     2216     +146
Hits          1921     2060     +139
Misses         149      156       +7
Hi @shenxiangzhuang! Thank you for contributing this dataset. LGTM.
Besides the CI tests, I also used this dataset for categorical aggregation, and it works well:

    from crowdkit.aggregation import DawidSkene
    from crowdkit.datasets import load_dataset

    # df holds the crowd annotations, gt the ground-truth labels
    df, gt = load_dataset('netease_crowd')
    # Dawid-Skene with 10 iterations
    ds = DawidSkene(n_iter=10)
    result = ds.fit_predict(df)
    print(len(result))
    # 999799
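A natural follow-up to the check above is to compare the aggregated labels against the ground truth. The snippet below is only a minimal sketch of that idea: it runs on a hypothetical toy sample rather than on NetEaseCrowd, assumes annotations arrive as (task, worker, label) triples, and swaps in plain majority voting instead of Dawid-Skene so that it runs without crowd-kit. The names `annotations`, `ground_truth`, `majority_vote`, and `accuracy` are illustrative and are not part of the crowd-kit API.

```python
from collections import Counter, defaultdict

# Hypothetical toy annotations as (task, worker, label) triples.
# The values are made up for illustration; they are not NetEaseCrowd data.
annotations = [
    ("t1", "w1", "cat"), ("t1", "w2", "cat"), ("t1", "w3", "dog"),
    ("t2", "w1", "dog"), ("t2", "w2", "dog"), ("t2", "w3", "dog"),
    ("t3", "w1", "cat"), ("t3", "w2", "dog"), ("t3", "w3", "dog"),
]
# Hypothetical ground truth, keyed by task.
ground_truth = {"t1": "cat", "t2": "dog", "t3": "cat"}


def majority_vote(rows):
    """Return the most frequent label per task (ties go to the smaller label)."""
    votes = defaultdict(Counter)
    for task, _worker, label in rows:
        votes[task][label] += 1
    return {
        task: min(counts, key=lambda lab: (-counts[lab], lab))
        for task, counts in votes.items()
    }


def accuracy(predicted, truth):
    """Share of tasks with a known true label whose prediction matches it."""
    scored = [task for task in predicted if task in truth]
    if not scored:
        return 0.0
    return sum(predicted[t] == truth[t] for t in scored) / len(scored)


result = majority_vote(annotations)
print(len(result))                     # number of aggregated tasks
print(accuracy(result, ground_truth))  # fraction of tasks labeled correctly
```

With the real dataset, the same comparison would be made between the output of `fit_predict` and `gt`, restricted to the tasks that have a ground-truth label.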
Thank you for a very well-done PR! I noticed a small imperfection in the dataset metadata. Could you please check my suggestion?
Co-authored-by: Dmitry Ustalov <[email protected]>
Thanks a lot for your careful review!
Great job, thank you again!
Checklist
Dataset info
This PR adds our open-source dataset, NetEaseCrowd (https://github.com/fuxiAIlab/NetEaseCrowd-Dataset).