As per the example on HF:
...
"confidence": 0.9999907, // Confidence score reflecting preference reliability, based on annotators' capabilities (independent of choice_dist)
...
It's great to have this score for data filtering purposes, but I'm wondering how you obtained such a score for each annotator, and to what extent it is reliable? There seem to be no details on this in the paper. Thanks!
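For context, the kind of filtering I have in mind is a simple threshold on that field. A minimal sketch (the records and threshold here are hypothetical, only the `confidence` field name is taken from the example above):

```python
# Hypothetical records mirroring the "confidence" field from the example.
records = [
    {"id": 1, "confidence": 0.9999907},
    {"id": 2, "confidence": 0.62},
    {"id": 3, "confidence": 0.991},
]

# Keep only annotations whose confidence clears an assumed threshold.
THRESHOLD = 0.99
filtered = [r for r in records if r["confidence"] >= THRESHOLD]
print([r["id"] for r in filtered])  # ids of the high-confidence records
```

So understanding how the score was derived would directly affect what threshold (if any) makes sense.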