Krippendorff's Alpha for position - implausible results #5305
INCEpTION uses DKPro Agreement. There is a paper and a couple of presentations about it for an introduction:
The implementation is here: If you want to understand it, maybe start by looking at that. If you get the correct numbers there, then there might be a bug in the way INCEpTION calls DKPro Agreement. However, if you already get unexpected numbers from DKPro Agreement directly, then it might have a bug itself. I have also tried doing a port of TextGamma to DKPro Agreement here: However, so far this port lacks qualified review and testing, so it is unclear whether it produces the expected results. If you look into DKPro Agreement's Krippendorff's Alpha and/or the Gamma branch, it is best to open issues/comment in that repo. If you find everything to be in order in DKPro Agreement and suspect that INCEpTION is calling it the wrong way, best comment here again.
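For intuition about the general shape of what DKPro Agreement computes, here is a minimal, self-contained sketch of the *coding* (non-unitizing) form of Krippendorff's Alpha, α = 1 − D_o/D_e, built from a coincidence matrix. This is my own illustration, not the DKPro code, and the unitizing variant used for position agreement additionally weighs units by their offsets and lengths:

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of per-item value lists (one value per rater
    that coded the item). Items coded by fewer than two raters are
    ignored, as in Krippendorff's definition.
    """
    coincidences = Counter()  # weights of ordered (value, value) pairs
    for values in units:
        m = len(values)
        if m < 2:
            continue
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    totals = Counter()  # marginal frequency n_c of each value
    for (a, _), w in coincidences.items():
        totals[a] += w
    n = sum(totals.values())

    # Observed disagreement: mass off the diagonal of the coincidence matrix.
    d_o = sum(w for (a, b), w in coincidences.items() if a != b)
    # Expected disagreement, estimated from the marginals.
    d_e = sum(totals[a] * totals[b] for a, b in permutations(totals, 2)) / (n - 1)
    return 1 - d_o / d_e

# Two raters agree on 3 of 4 items:
alpha = krippendorff_alpha_nominal([("a", "a"), ("a", "b"), ("b", "b"), ("b", "b")])
print(round(alpha, 4))  # 0.5333
```

Note that D_e depends on the marginal distribution of the values, which is one reason the score can shift in non-obvious ways when annotations change.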
What may also help you is the diff export that you can get from the agreement page. In particular, you can find there the offsets of the positions that are passed to the agreement measure. Also look out for the
I did a little experiment in INCEpTION in a unit test. Setup 1:
Setup 2:
At least in this little experiment, the agreement degrades when there is an overlap match instead of an exact match. Code (adjust the offsets of user 1 manually to test):
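For what it's worth, my reading of Krippendorff's unitizing metric may be relevant here: the distance between two units is not binary, so an overlap is not simply "wrong". A sketch of the distance function as I understand it (please double-check against Krippendorff's 1995 paper and the DKPro source; this is my paraphrase, not the implementation):

```latex
\delta(u,v) =
  \begin{cases}
    (b_u - b_v)^2 + (e_u - e_v)^2 & \text{if units } u, v \text{ overlap} \\
    \ell_u^2                      & \text{if unit } u \text{ lies wholly in a gap of the other rater} \\
    0                             & \text{otherwise}
  \end{cases}
```

where $b$, $e$, $\ell$ are begin, end, and length. Shifting a boundary by one token adds only a small squared term to the observed disagreement, while the expected disagreement (estimated by redistributing all units over the continuum) also depends on the units' lengths, so the final α can move in either direction.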
Describe the bug
I tried to understand how Krippendorff's Alpha unitizing for position is implemented and had two annotators annotate a test document. One annotator has 5 annotations, the other just one. If I have one span with an exact match, the calculated score is 0.4; if I have one span with an overlap match (the span differs by one token), I get 0.42. How is this possible?
In fact, I would like to understand better how KA is implemented, especially how the agreement matrix is calculated, because I observed implausible results for other documents in my corpus as well. Also, I would like to know whether it would be possible to implement another metric (e.g., gamma) that seems more suitable for dealing with overlapping spans.
Please complete the following information:
Thanks!